Sunday, 19 December 2010

Porting URBI 2.x for AIBO

I have not written a blog entry for quite a long time. My research project, AiBO+, is not dead; I have been working hard to update URBI 1.5 to a 2.x version. The work is not done yet, but I won 3rd prize in the Gostai Open Source Contest 2010 with this subproject.

I have performance problems with Urbi 2 on Aibo: the more extensive use of Urbi script in the core of Urbi 2 seems to slow everything down to the point of being unusable. The Urbi developers plan to move more and more functionality from native implementation into scripts, but this kills the performance on embedded robot systems like Aibo. If they re-implement some performance-critical parts in native C++, the upgrade will be feasible for Aibo.

So was it a waste of time to work on the Urbi 2 port? No. I have some great achievements, such as upgrading the Aibo toolchain to gcc 4.x, enabling aggressive compiler optimisations for the final Aibo binaries and reimplementing the linking tools in Open-R. These advances squeeze almost everything out of the RM-7000 CPU in the ERS-7. A lot of time went into these tasks. It would be great to upgrade the toolchain to gcc 4.4.x or later to get the MIPS-specific PLT optimisations, which would boost the performance a bit more and reduce the final binary size, but I think this task is almost impossible without help from Sony. I have not received any information to support my current efforts. I did everything on my own, and the work was hard considering that Aibo does not run a Unix-like system but a proprietary black box.

The plan for the future is to re-implement some low-level functions to replace Urbi 1.5 completely and to start implementing locomotion functions from scratch. I have not found much reusable open-source software for this, although I will use anything useful that turns up. The Aibo joints have not moved for half a year, so I am very keen to implement these functions. Aibo movements are hard-coded in the official Sony Mind software and in Urbi, so it will be very interesting to implement movements that respond to the forces in the joints and to objects in the environment.

The fetching of sensor data and images is now done, and I am able to ping my native AiBO+ server on the robot. Another cool thing is that the WiFi LED shows whether the AiBO+ client is connected to the robot, and the LEDs on the back of the robot show the battery status. My general plan for the client-server architecture has changed a bit: I will implement many functions on the robot itself, so if there is no connection to a computer client, only some heavy functions will be unavailable, while locomotion and other lower-level functions will still work.

Let's see the future!

Saturday, 5 June 2010

The result of the garden building

After many weeks of strenuous work, last week I dotted the i in the garden and built a bamboo fountain. A Japanese maple (Acer Palmatum Garnet) has been planted behind the fountain; let's hope it survives into next year. The nursery I bought it from online did not send a particularly vigorous specimen, but this variety may simply be like that. We'll see:



Here are a few more pictures of the other parts:


This is the Japanese maple (Acer Palmatum Atropurpurea) planted in the front garden.


This is a slightly different view of the same part. Towards the middle of the picture there used to be a big bush, which I cut out some 2-3 weeks ago in an adventure that took a whole evening.


The back garden from a wider angle.


This is the back garden again from a slightly different angle.


The hedge behind the back garden used to be the border of civilization. Behind it lay the junk that had piled up over the years. It took quite a lot of effort, but it is starting to look decent now.


The first patches of greater periwinkle (Vinca Major) at the back. We'll see how much of it survives to next year.


At the back you can walk out towards a bus stop, and in rainy weather the muddy, grassless path was rather unpleasant. Now an ad-hoc gravel path leads to the back.


Another fresh splash of colour in the garden is a mini rock garden. The rocks were already there; only the concept among them had to be created.

That's all for now, and I'd rather not say how long and how expensive it was to create all this, because everyone would have a heart attack. The garden is the joint work of Zsuzsa and me. :)

Thursday, 27 May 2010

AiBO+ 1.0 is released :)

It is the first step on a long road, but it has been taken now: AiBO+ 1.0 is released with some very simple behaviors for AIBO.

The home page is updated with fresh, new documentation: http://aiboplus.sf.net
Windows binary installer can be downloaded from: http://sourceforge.net/projects/aiboplus/files/Windows binaries
Ubuntu PPA for Jaunty, Karmic, Lucid: https://launchpad.net/~csaba-kertesz/+archive/aiboplus

Thursday, 6 May 2010

Layouts in Qt

The layout concepts in Qt are not as easy to understand as those in GTK. A simple window was driving me crazy, but eventually I found the following web page, and it saved my life: http://www.embrisk.com/notes/qt_resize.html :)

Tuesday, 4 May 2010

Linking Qt under MinGW

I think I forgot earlier to write a blog post about the problems of linking with Qt under MinGW/Windows. I had trouble linking against the Qt libraries, and the solution was to check which linker options the Qt libraries themselves were built with. Adding the following linker options to my project solved the problem:

-enable-stdcall-fixup -Wl,--enable-auto-import,-enable-runtime-pseudo-reloc

Wednesday, 21 April 2010

Agner Fog's asmlib optimization of memcpy - A short study

In my software, the data flows and the image processing operations mean that large segments of memory are allocated, deallocated and copied. Optimizing these operations is an obvious direction. Earlier, I integrated Google's TCMalloc (a malloc replacement) to speed up memory allocation, and it reduced the run-time by 10 %.
I came across Agner Fog's e-books on the internet and got excited about his asmlib library, in which he implements optimized memory functions in assembly. A table in his books suggests that his implementations are very efficient. I am a sceptical person, so I did some measurements under various circumstances. My simple test program copied memory from one character array to random offsets in another. In the aligned tests, the program copied 10000 bytes at a time, and in the unaligned tests, 9999 bytes. The program below was compiled with various gcc optimization flags and run under the time program to measure the run-time:

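#include <cstdlib>   // rand, srand
#include <cstring>   // memcpy
#include <ctime>     // time

int main(int, char**)
{
    char a1[1100000];   // destination: 1000000 possible offsets plus room for one copy
    char a2[10000];     // source buffer; its contents do not matter for the timing

    srand(time(NULL));

    // One million copies to random offsets (9999 bytes instead of 10000 in the unaligned tests).
    for (int i = 0; i < 1000000; i++)
        memcpy(&a1[rand() % 1000000], a2, 10000);
}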

The four columns of each measurement are:

1. Without asmlib, without -fno-builtin flag to gcc.
2. Without asmlib, with -fno-builtin flag to gcc.
3. With asmlib, without -fno-builtin flag to gcc.
4. With asmlib, with -fno-builtin flag to gcc.

The -fno-builtin flag disables gcc's built-in versions of the memory operations. At higher optimization levels, gcc can optimize the built-in versions where possible.
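
One possible explanation for very short timings at higher optimization levels is that the compiler may discard copies whose destination is never read; this is an assumption, not something measured in this post. A minimal sketch of a harness that times the loop inside the program and reads the destination back afterwards could look like the following. The names CopyResult and time_copies are hypothetical, and std::chrono requires a C++11 compiler, which is newer than the gcc versions used here.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <vector>

// Hypothetical result type: elapsed wall-clock time plus a value derived from
// the destination buffer, so the copies cannot be treated as dead stores.
struct CopyResult
{
    double seconds;
    std::size_t marked_bytes;
};

// Hypothetical helper: copies `size` bytes to random offsets `iterations` times.
CopyResult time_copies(std::size_t size, int iterations)
{
    // The destination has 100000 bytes of headroom above the largest offset.
    assert(size > 0 && size <= 100000);

    std::vector<char> dst(1100000, 0);
    std::vector<char> src(size, 'x');

    std::srand(42);  // fixed seed so repeated runs use the same offsets

    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; i++)
        std::memcpy(&dst[std::rand() % 1000000], src.data(), size);
    const auto stop = std::chrono::steady_clock::now();

    // Read the destination back: count the bytes that were overwritten.
    std::size_t marked = 0;
    for (char c : dst)
        if (c == 'x')
            marked++;

    return CopyResult{std::chrono::duration<double>(stop - start).count(), marked};
}
```

Calling time_copies(10000, 1000000) and time_copies(9999, 1000000) would correspond to the aligned and unaligned runs of the original program.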

A) CPU: 32 bit, Operating system: 32 bit
Details: AMD Sempron 2800+ (Thoroughbred-B, 2000 MHz), Windows XP Home (MinGW, gcc 4.4.1).

Aligned tests:

gcc -O0: 2.7 s, 2.7 s; 2.7 s, 2.7 s
gcc -O1: 2.7 s, 2.7 s; 2.7 s, 2.7 s
gcc -O2: 0.03 s, 2.7 s; 0.03 s, 2.7 s
gcc -O3: 0.03 s, 2.7 s; 0.03 s, 2.7 s
gcc -Os: 0.03 s, 2.7 s; 0.03 s, 2.7 s

Unaligned tests:

gcc -O0: 2.8 s, 2.8 s; 2.8 s, 2.7 s
gcc -O1: 8.2 s, 2.8 s; 8.2 s, 2.7 s
gcc -O2: 8.2 s, 2.8 s; 8.2 s, 2.7 s
gcc -O3: 8.2 s, 2.8 s; 8.2 s, 2.7 s
gcc -Os: 8.2 s, 2.8 s; 8.2 s, 2.7 s

For this kind of CPU, the recommendations are:
- If a program has many aligned memory copies, -O2 and -O3 are advised without -fno-builtin.
- If a program has many unaligned memory copies, the -fno-builtin flag is advised.
- asmlib does not give more performance for memcpy.

B) CPU: 32 bit, Operating system: 32 bit
Details: Intel Core Duo (U2500, 1200 MHz), andLinux (coLinux with Ubuntu Jaunty, gcc 4.3.3, libc6 2.9) under Windows XP Home.

Aligned tests:
gcc -O0: 6.5 s, 6.0 s; 4.2 s, 4.1 s
gcc -O1: 6.7 s, 6.7 s; 4.1 s, 4.1 s
gcc -O2: 6.8 s, 6.7 s; 4.1 s, 4.1 s
gcc -O3: 6.7 s, 6.7 s; 4.1 s, 4.1 s
gcc -Os: 6.0 s, 6.0 s; 6.0 s, 6.0 s

Unaligned tests:

gcc -O0: 6.5 s, 6.5 s; 4.2 s, 4.1 s
gcc -O1: 6.8 s, 6.8 s; 4.1 s, 4.1 s
gcc -O2: 6.8 s, 6.8 s; 4.1 s, 4.1 s
gcc -O3: 6.8 s, 6.8 s; 4.1 s, 4.1 s
gcc -Os: 18.0 s, 18.0 s; 18.0 s, 18.0 s

coLinux is a "native" Linux environment running under Windows. Here gcc cannot optimize the built-in memory copy, and asmlib does a decent job; only the -Os option should be avoided for unaligned copies. asmlib reduces the run-time by ~38 %, so it is recommended with or without -fno-builtin.

C) CPU: 32 bit, Operating system: 32 bit
Details: Intel Core Duo (U2500, 1200 MHz), Windows XP Home (MinGW, gcc 4.4.1).

Aligned tests:

gcc -O0: 3.6 s, 5.0 s; 3.6 s, 3.5 s
gcc -O1: 5.0 s, 5.7 s; 5.0 s, 3.5 s
gcc -O2: 0.03 s, 5.7 s; 0.03 s, 3.5 s
gcc -O3: 0.03 s, 5.7 s; 0.03 s, 3.5 s
gcc -Os: 0.03 s, 5.7 s; 0.03 s, 3.5 s

Unaligned tests:

gcc -O0: 5.2 s, 5.3 s; 3.6 s, 3.6 s
gcc -O1: 15.0 s, 5.8 s; 15.0 s, 3.6 s
gcc -O2: 15.0 s, 5.8 s; 15.0 s, 3.6 s
gcc -O3: 15.0 s, 5.8 s; 15.0 s, 3.6 s
gcc -Os: 15.0 s, 5.8 s; 15.0 s, 3.6 s

The optimized aligned copies are as fast as on the AMD Sempron, but the Sempron is much better in the other cases. This may be the effect of its higher clock frequency. Except for aligned memory copies at higher optimization levels, linking with asmlib and using the -fno-builtin flag is advised.

D) CPU: 64 bit, Operating system: 64 bit
Details: Intel Core2 Duo (P9300, 2260 MHz), Ubuntu Karmic, gcc 4.4.1, libc6 2.10.

Aligned tests:

gcc -O0: 1.8 s, 1.8 s; 0.8 s, 0.8 s
gcc -O1: 1.8 s, 1.8 s; 0.8 s, 0.8 s
gcc -O2: 0.01 s, 0.01 s; 0.01 s, 0.01 s
gcc -O3: 0.01 s, 0.01 s; 0.01 s, 0.01 s
gcc -Os: 0.01 s, 0.01 s; 0.01 s, 0.01 s

Unaligned tests:

gcc -O0: 1.8 s, 1.8 s; 0.8 s, 0.8 s
gcc -O1: 1.8 s, 1.8 s; 0.8 s, 0.8 s
gcc -O2: 1.8 s, 1.8 s; 0.8 s, 0.8 s
gcc -O3: 1.8 s, 1.8 s; 0.8 s, 0.8 s
gcc -Os: 5.0 s, 5.0 s; 5.0 s, 5.0 s

Well, the results are impressive. It is interesting to see that gcc apparently still used its internal memory operations (?) despite the -fno-builtin option at -O2 and -O3, so the aligned copies were highly optimized under all circumstances. asmlib cuts the run-time by more than 50 % in the unaligned tests (except with -Os) and in the aligned tests at lower optimization levels. Linking with asmlib is highly advised for this processor.
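
The percentages quoted for asmlib can be checked against the tables. A small sketch, where reduction_percent is a hypothetical helper name and the inputs are timings taken from the tables above:

```cpp
// Hypothetical helper: relative run-time reduction in percent,
// given the run-time without and with asmlib in seconds.
double reduction_percent(double before_s, double after_s)
{
    return 100.0 * (before_s - after_s) / before_s;
}
```

With the aligned -O1 timings from section B, reduction_percent(6.7, 4.1) gives roughly 38.8, and with the unaligned -O2 timings from section D, reduction_percent(1.8, 0.8) gives roughly 55.6.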

Conclusion

Many conditions affect the possible optimizations of the code: CPU, operating system, compiler. There is no universal solution, and the situation can be complicated further by varying the optimization flags per source file or by overriding memcpy only in some source files for faster unaligned copies. Good code starts with the choice of the right algorithm; optimization should be just fine-tuning for the final program.
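
Overriding the copy routine only in selected source files could be sketched as follows. This is an illustration under assumptions, not how my project does it: optimized_copy, copy_bytes and the USE_OPTIMIZED_COPY macro are hypothetical names, and optimized_copy simply forwards to std::memcpy so that the sketch builds without asmlib.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for an optimized routine such as the one asmlib
// provides; it forwards to std::memcpy so no extra library is needed here.
static inline void* optimized_copy(void* dst, const void* src, std::size_t n)
{
    return std::memcpy(dst, src, n);
}

// Per-file switch: compile only the selected source files with
// -DUSE_OPTIMIZED_COPY (hypothetical macro name). The function has internal
// linkage, so each source file may legitimately get a different body.
static inline void* copy_bytes(void* dst, const void* src, std::size_t n)
{
#ifdef USE_OPTIMIZED_COPY
    return optimized_copy(dst, src, n);
#else
    return std::memcpy(dst, src, n);
#endif
}
```

Source files with many unaligned copies would call copy_bytes and be built with the macro defined, while the rest of the program keeps the compiler's built-in memcpy.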

Friday, 9 April 2010

Ubuntu experiences

Well, after a month and a half of trying I realized that it is quite hard to get involved with Ubuntu. I consider myself a professional, and I understand that I have to prove this to get access to the community-maintained repositories, but I would only like to maintain and provide fixes for some software. The way of providing fixes just does not work for me. Everybody is very busy there, and it is hard to push fixes forward. At least, this is my experience, and I have more important things to do now than struggle with these issues. It seems to me that the best way to get the required backports and fixes is to maintain that software stack in my PPA.

A good example is google-perftools with libunwind. google-perftools has been broken on the amd64 architecture since (at least) Jaunty. Debian integrated newer packages and they are coming into Lucid, but there is this bug:

https://bugs.launchpad.net/ubuntu/+source/libunwind/+bug/522106

Anyone who reads the bug, together with the linked bugs from other sources, can see that this issue is easy to fix: disable the setjmp library completely, or only for amd64, where it does not compile. The bug has been untouched for more than a month. I have two choices:
1. Implement this quick fix and put the packages in my PPA for Lucid and older versions (Karmic, Jaunty...) -> A few hours at most.
2. Implement this quick fix, start the process to get it into Lucid and start another process to get the packages into backports. -> Who knows how long...?

A quick fix and advertising my PPA to interested people seems much easier to me. And I am not even talking about getting new (already packaged) software into Ubuntu, which is a nightmare compared to this.

Note that my thoughts do not apply to crucial and complex packages (e.g. OpenOffice or GTK).

The bike is out.

Yesterday I took my bike out of storage and rode it to work. I must say I felt good afterwards that my winter idleness is over; I had missed cycling. The fly in the ointment is that the rear tyre started to lose air, which I do not understand, because the bike was supposedly serviced last autumn and has a new tyre. When I took it out, there was nothing wrong with it and it was properly inflated, but by the afternoon after the ride it had gone flat.

Sunday, 28 March 2010

The snow is melting...

At last the snow has started to melt in earnest, and the temperature is staying above 0 degrees. :)

Saturday, 20 March 2010

My PPA

I am happy to announce the first packages uploaded to my personal Ubuntu PPA. To begin with, the Sony Open-R SDK for AIBO, the Urbi Engine SDK for AIBO 1.5 and the AiBO+ memory stick have been packaged. The archive is available for Lucid, Karmic and Jaunty.

I hope it will be useful for some people and I can soon push the packages to the Ubuntu repositories.

The address with technical info:

https://launchpad.net/~csaba-kertesz/+archive/aiboplus

Thursday, 4 March 2010

Job offers

Over the past week, three companies approached me asking whether I would like to work for them. Sadly, I had to tell them that I am very satisfied with my current job and do not want to change. :)

Sunday, 28 February 2010

Homemade rolls

I made these curd cheese rolls myself, my very first creation in the bread/roll/crescent-roll universe:

Tuesday, 23 February 2010

QDialogs embedded in a QGraphicsScene lost their window decorations in Qt 4.6

I discovered a problem after upgrading to Qt 4.6.x: my debug dialogs embedded into a graphics view/scene lost their window decorations. After a bit of digging, I found out that I had set the appropriate window flags on the QDialog but left the default flags on the QGraphicsProxyWidget. Somehow this worked in Qt 4.4.x/4.5.x but not in Qt 4.6.x. The solution was to pass the same window flags to the proxy widget as well.

Friday, 19 February 2010

Bug fixing in ubufox

I started to work on ubufox (Ubuntu Extension for Firefox), doing various bug fixes to earn positive feedback for my Ubuntu developer application.

Thursday, 18 February 2010

Facing a problem with Qt 4.6

My debug window broke with Qt 4.6; at least, the window decorations of the QDialogs embedded in a graphics scene are missing. The same application works with Qt 4.4.x and 4.5.x. I will investigate whether it is a bug in Qt or just a changed behavior.

Wednesday, 17 February 2010

Becoming an Ubuntu Developer+documentation

My main focus in AiBO+ is writing documentation before the 1.0 release. I really believe that it is a major milestone for the project, even if the AI achievements are quite small for now. My feeling is that the project is on the right track and progress will be more visible this year.

My plan is to release the software in Ubuntu, therefore I should become an Ubuntu Developer. It is not an easy or fast process, but I will fix some bugs in the ubufox package and hopefully earn some recommendations to the Council so that they can approve my application. AiBO+ will probably lose the chance to get into Lucid because it is already too late for that.

I am already excited about the future: I would like to go to a conference with a paper describing the latest developments in the project. I have also migrated to Qt's State Machine Framework, which is available only in the newest Qt 4.6. I think this approach can simplify and improve my implementation of the different states of AIBO.

Let's see.

Pig's trotter stew + digital TV

Last week I managed to buy a few pig's trotters, so at the weekend I made trotter stew (körömpörkölt). I quickly put most of it in the freezer so that I can enjoy this delicacy from time to time, and the rest went into my tummy. :)

I spent yesterday evening mounting an outdoor antenna on the balcony to receive the digital broadcast, because our indoor antenna sometimes did not work well enough (we live in a valley, and the room faces away from the transmitter).

The wife has it very good these days: yesterday I looked after everyone while she went to yoga, and on Saturday I looked after Ajna all day while she was broadening her mind at a training course. And now she would have a girls' party tonight; I have yet to see whether I let her go. :)

Saturday, 13 February 2010

Pork knuckle stew

This is how the pork knuckle stew (csülökpörkölt), which started out as trotter stew, was made: