Sunday, May 19, 2019

Limit the CPU maximum frequency to avoid thermal shutdowns under Ubuntu 18.04

I have had similar problems with multiple laptops. Over time, the CPU tends to overheat more and more easily and the machine shuts down; replacing the CPU fan and applying quality thermal paste never helped me in these situations. So far I have limited the max frequency on Ubuntu, but it can still happen that you leave your laptop in the sun for a moment while it is doing some processing, the whole laptop body heats up, and it eventually shuts down.
I learned that the newest laptops with Intel chips do not work properly with cpufreq-set, only with the likwid tools.
Install this package:
sudo apt install likwid
I wrote the following Python script (manipulate_cpu_freq.py) to decrease/increase the max CPU frequency under Ubuntu 18.04 (it requires Python 3.7 for subprocess.run's capture_output argument):
#!/usr/bin/python3.7

import argparse
import os
import subprocess

parser = argparse.ArgumentParser(description = "Manipulate CPU frequencies", prefix_chars = '-')
# Note: type = bool would treat any non-empty string (even "0") as True, so 0/1 integers are used instead
parser.add_argument("-d", "--decrease", help = "decrease the max frequency (0/1)", type = int, default = 0)
parser.add_argument("-i", "--increase", help = "increase the max frequency (0/1)", type = int, default = 0)
parser.add_argument("-s", "--silent", help = "silent mode (0/1)", type = int, default = 0)
args = parser.parse_args()

query_freqs_output = subprocess.run(["likwid-setFrequencies", "-l"], capture_output = True)
query_freqs_output = query_freqs_output.stdout.decode('utf-8').split('\n')[1]
# split() without arguments skips the empty fields caused by repeated spaces
available_freqs = list(map(float, query_freqs_output.split()))

query_curr_freq_output = subprocess.run(["likwid-setFrequencies", "-p"], capture_output = True)
query_curr_freq_output = query_curr_freq_output.stdout.decode('utf-8').split('\n')[1]
query_curr_freq_output = query_curr_freq_output.split('/')[-1]
current_freq = float(query_curr_freq_output.split(' ')[0])
curr_freq_index = min(range(len(available_freqs)), key = lambda i: abs(available_freqs[i]-current_freq))

if not args.silent:
  print("Available frequencies:", available_freqs)
  print("Current frequency:", current_freq)

if args.decrease:
  print("Decrease the frequency")
  if curr_freq_index == 0:
    print("Warning: Can't decrease the frequency because it is already at min")
    exit(1)

  print("Set to frequency", available_freqs[curr_freq_index-1], "GHz")
  subprocess.run(["likwid-setFrequencies", "-y", str(available_freqs[curr_freq_index-1])])
  exit(0)

if args.increase:
  print("Increase the frequency")
  if curr_freq_index == len(available_freqs)-1:
    print("Warning: Can't increase the frequency because it is already at max")
    exit(1)

  print("Set to frequency", available_freqs[curr_freq_index+1], "GHz")
  subprocess.run(["likwid-setFrequencies", "-y", str(available_freqs[curr_freq_index+1])])
  exit(0)
And I use a script running in the background to monitor the CPU temperature (run_cpu_policy.sh):
#!/bin/bash

while true
do
  CPU_TEMP=$(cat /sys/devices/virtual/thermal/thermal_zone0/temp)
  echo "CPU Temperature: $((CPU_TEMP / 1000))°C"
  if [ "$CPU_TEMP" -gt 76000 ]; then
    echo Decrease the max CPU frequency
    sudo manipulate_cpu_freq.py -s 1 -d 1
  fi
  if [ "$CPU_TEMP" -le 68000 ]; then
    echo Increase the max CPU frequency
    sudo manipulate_cpu_freq.py -s 1 -i 1
  fi
  sleep 10
done
Of course, you must check which sysfs entry (e.g. /sys/devices/virtual/thermal/thermal_zone0/temp) contains your CPU temperature and adapt the script above. I increase the max CPU frequency when the temperature is below 68°C and decrease it when it is above 76°C. This is a very conservative policy, but if the CPU sits permanently above 80°C, the temperature can quickly climb above 100°C (around the thermal shutdown threshold), so I try to always keep it below 80°C, just to be sure.
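The two thresholds form a simple hysteresis: the loop only acts outside the 68-76°C band. As a standalone sketch (with a hypothetical policy helper), the decision looks like this:

```shell
#!/bin/bash
# Sketch of the temperature policy above as a standalone function.
# "policy" is a hypothetical name; the input is a millidegree-Celsius
# reading as found under /sys/.../thermal_zone*/temp.
policy() {
  if [ "$1" -gt 76000 ]; then
    echo decrease   # above 76°C: step the max frequency down
  elif [ "$1" -le 68000 ]; then
    echo increase   # at or below 68°C: step it back up
  else
    echo keep       # inside the 68-76°C band: leave it alone
  fi
}

policy 80000   # prints "decrease"
```

The dead band between the two thresholds prevents the script from oscillating between decrease and increase every 10 seconds.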
I had to develop the above solution yesterday after two thermal shutdowns on a sunny, hot day while continuously running intensive computations on my laptop CPU (Intel i7-6600U).
You can run the script after every startup by adding it to the cron jobs (/etc/crontab):
@reboot root systemd-run --scope sudo -u YOUR_USER screen -dmS cpu_policy /home/YOUR_USER/run_cpu_policy.sh
Be sure to have screen installed:
sudo apt install screen
You can check on it while it is running:
screen -r cpu_policy

Saturday, March 9, 2019

Configuring a USB DisplayLink monitor on Ubuntu 18.04

This blog post is about my experiences configuring a USB DisplayLink monitor (Mobile Pixels Duex) under Ubuntu Linux 18.04 with KDE.

Installation
--------------
Download and execute the installer from:


The installation was very smooth under Ubuntu 18.04. Since the DisplayLink installer compiles a kernel module, be sure that the kernel headers, dkms, make and g++ packages are installed on your system.

Screen setup
------------------
After connecting Duex, it switched on, but the display was not configured. The display preferences dialog under KDE System Settings was unable to handle the Duex, and the laptop screen changed to full resolution for some reason when Duex was connected. xrandr can configure Duex correctly; two commands do the trick on the command line:

- Optional step to reset back the laptop screen resolution if you use other resolution than the native. Example command:
 xrandr --output eDP1 --mode 1600x900 --primary  
- Set Duex to extended display on the right side of the main screen (check "xrandr --help" for other configurations):
 xrandr --output DVI-I-1-1 --right-of eDP1 --mode 1920x1080 --noprimary  

Disconnecting and reconnecting the Duex any time after this point will be just fine, no other xrandr tweaking is needed. However, these xrandr changes are not permanent: after every reboot, they must be repeated once Duex is connected for the first time. eDP1 and DVI-I-1-1 might be different on your system; you can check the right IDs with a simple "xrandr" command, which lists all available display connection IDs.
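Since the layout has to be reapplied after each reboot, the two commands can be collected into a small script to run once Duex is connected. This is a sketch; the output IDs and modes below are from my setup and must be adjusted to what xrandr reports on yours:

```shell
#!/bin/bash
# Reapply the dual-screen layout after a reboot, once Duex is connected.
# eDP1/DVI-I-1-1 and the modes are examples; check "xrandr" for your IDs.
xrandr --output eDP1 --mode 1600x900 --primary
xrandr --output DVI-I-1-1 --right-of eDP1 --mode 1920x1080 --noprimary
```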

Brightness control
------------------
Duex is very bright by default, and it is not possible to change the brightness under Linux because of missing support in the DisplayLink driver. Because the brightness value persists after it is changed on the device, my workaround was to install the DisplayLink driver in a Windows 10 virtual machine, adjust the brightness to my preference with the ScreenBright application, and remove Duex from the virtual machine. Since the brightness change is permanent, the setting remains after disconnecting and reconnecting Duex. It is a dirty workaround, but at least you are not locked to a certain brightness level permanently. The same method works just fine without virtual machines by temporarily connecting Duex to a real Windows PC or laptop.

Brightness control (in software)
---------------------------------------
Without changing the backlight of Duex with hardware controls, xorg can apply brightness/gamma values in software to correct the screen. This method does not change the backlight level of the Duex, but it would provide a way to fine-tune the brightness level. This is not available yet, but at least it is doable. The modesetting driver used by DisplayLink added software gamma support in January 2018, but this change has not landed in Ubuntu yet:


Software gamma support must also be implemented in the open source parts of the DisplayLink driver. The efforts can be tracked via:


Further tweaking
----------------------
If your laptop has an integrated Wacom touchscreen, the touch input will move the mouse pointer to the wrong coordinates while Duex is connected, because the overall display dimensions change with the extended screen. You have to map your Wacom device back to the laptop screen with xsetwacom once Duex is connected or disconnected:

 xsetwacom set "Wacom Pen and multitouch sensor Finger touch" MapToOutput eDP1  

The correct Wacom device ID can be checked with "xinput" on the command line and the correct display ID (eDP1) with "xrandr". These steps can be automated when Duex is connected/disconnected with srandrd (https://github.com/jceb/srandrd). The following script, run by srandrd, works (correct the display/touchscreen IDs if needed):

 #!/bin/bash  
 case "${SRANDRD_OUTPUT} ${SRANDRD_EVENT}" in  
  "DVI-I-1-1 connected") sleep 5 && xsetwacom set "Wacom Pen and multitouch sensor Finger touch" MapToOutput eDP1 ;;  
  "DVI-I-1-1 disconnected") sleep 5 && xsetwacom set "Wacom Pen and multitouch sensor Finger touch" MapToOutput eDP1 ;;   
 esac  


Monday, March 26, 2018

Flickering with Google Chrome on Kubuntu 16.04

There are many flickering issues with Intel GPUs and Google Chrome on Linux. I use Ubuntu 16.04 with KDE and hide the window title bars to get the maximum usable screen space on my laptop. Chrome flickered heavily with some web pages, but only when the window title bar was hidden. I tried a few fixes from askubuntu:

https://askubuntu.com/questions/766725/annoying-flickering-in-16-04-lts-chrome

But eventually it turned out that I had to change the KWin rendering backend to XRender; the flickering stopped without disabling any GPU acceleration.
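The backend can also be switched from the command line on Plasma 5. This is a sketch under the assumption that your Plasma version stores the setting under the Compositing group of kwinrc; verify the group/key names on your system before relying on it:

```shell
# Set the KWin compositing backend to XRender, then restart KWin to apply it.
kwriteconfig5 --file kwinrc --group Compositing --key Backend XRender
kwin_x11 --replace &
```

The same setting is reachable through System Settings under Display and Monitor, Compositor.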

Monday, May 8, 2017

Zapcc Compiler

I noticed that a small startup company (Ceemple) has introduced a modified Clang compiler (Zapcc) to reduce compilation time. Instead of a disk-based cache like ccache, they cache some intermediate compilation pass results in memory (at least that is my bet) and reuse them for later compilation units. I modified the CMake build system of my robotics research project (AiBO+) and measured their claims. I found out that they do not overpromise. My project is about 90 kLOC now, and Zapcc was 30% faster than vanilla Clang 4.0.

Gcc 5.4: 6:40 min
Clang 4.0: 6 min
Zapcc: 4 min

I highly recommend their product. They do not just promise, they deliver. And it is really a drop-in replacement for gcc/Clang in a 64-bit Linux environment.
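Switching an existing CMake project over is a one-liner at configure time. This is a sketch, assuming the Zapcc package puts zapcc/zapcc++ driver binaries on your PATH:

```shell
# Point CMake at the Zapcc drivers instead of gcc/clang, then build as usual.
cmake -DCMAKE_C_COMPILER=zapcc -DCMAKE_CXX_COMPILER=zapcc++ ..
make -j"$(nproc)"
```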

Thursday, November 10, 2016

When you can use folly::ThreadLocalPtr instead of boost::thread_specific_ptr

In this post, I compare two solutions for thread local storage:

1. boost::thread_specific_ptr in the Boost library, which provides a cross-platform solution to store data per thread.
URL: http://www.boost.org/

2. Facebook's folly has folly::ThreadLocalPtr, which has a very similar interface to boost::thread_specific_ptr for easy adoption and is claimed to be 4x faster. The disadvantage of folly is that it supports only 64-bit Mac OS X and Linux distributions (e.g. no Windows), and it requires a modern C++ compiler with C++11 support.

URL: https://github.com/facebook/folly
URL: https://github.com/facebook/folly/blob/master/folly/docs/ThreadLocal.md

I did not write a synthetic case for performance testing to quantify how much faster folly is; instead, I came up with a real-world use case from my research project. I develop an artificial intelligence for the old Sony ERS-7 robot dogs, and as part of my efforts, I can simulate the AI in a thread on my host Linux machine for multithreaded testing. When I replaced Boost with folly in the testing, I was surprised that it really is faster. Obviously not 4x, but it is still much faster. On the other hand, I saw a significant difference in folly's performance when different compilers were used, which is why I decided to create this blog post.

Test case: I ran one AI simulation test case in a single thread 100 times; the averaged runtime is shown in the diagrams below. In one place, my code was inefficient: it used the thread local pointer (TLP) inside a frequently called loop. Because of this performance bug, the long execution time was mainly caused by the TLP bottleneck. After I moved the TLP usage outside the loop, the performance gains with folly were no longer relevant. I still think it has some value to publish these numbers to show how folly can improve the situation when TLP is heavily used, and when it is not.

Compiler setup

Important compiler switches: -O3 -std=gnu++14
I-don't-think-so-relevant compiler switches: -mfpmath=sse -march=core2 -fPIC

Gcc 5.4 came from the official Ubuntu Xenial repositories.
Gcc 6.2 was installed from a toolchain testing PPA for Ubuntu (https://launchpad.net/~ubuntu-toolchain-r/+archive/ubuntu/test)
Clang 3.9 and 4.0 (4.0~svn286079) were installed from LLVM repositories (http://apt.llvm.org).

Case 1

So in this case, the TLP was used inefficiently, causing a large part of the runtime. As we can see, the Boost results are almost identical; gcc is a bit faster than clang. However, with folly not only is everything faster, but clang provides much better performance than gcc. Although gcc 6 improved the performance a bit over gcc 5, clang 4.0 is still 17.7% faster than gcc 6.2.


Left axis is execution time in milliseconds. Lower is better.


Case 2

After the TLP usage was fixed in my code, the compilers delivered almost identical results, since the TLP access time did not play such a major role as in Case 1. Boost and folly results are quite similar, but clang was faster with folly by a small margin. gcc 5.4 was unexpectedly faster with folly, but I assume that was a coincidence of some optimizations, since that compiler was the slowest with folly in Case 1.


Left axis is execution time in milliseconds. Lower is better.


Verdict

No silver bullets here. If you have a program which heavily uses thread local pointers under Linux or Mac OS X, it is worth trying folly to gain some speed. Otherwise, Boost provides a generic cross-platform solution when TLP is not used all over the place.

Friday, March 11, 2016

Unity Build macro for CMake

Half a year ago, I came across some techniques to speed up C++ compilation for bigger projects. As my ever-growing AiBO+ project is around 87,000 C++ source lines without 3rd party libraries now, it really became important to keep the compilation time low. Apart from the ccache integration in my CMake build system, I decided to try out the Unity Build method. When I looked around the internet, I found that Unity Build would shorten the build duration significantly and some people had crafted small CMake scripts, but these solutions were incomplete: either they did not handle Qt's moc files at all, or all Unity files were regenerated each time a CMake reconfigure was initiated. Another example is Cotire, which does not handle the dependencies correctly (https://github.com/sakra/cotire/issues/77).

My script is based on Christoph Heindl's work although heavily modified. So here it is:

https://sourceforge.net/p/aiboplus/code/ci/master/tree/aiboplus/UnityBuild.cmake

The features of my Unity Build script for CMake:
- Easy to add to existing projects
- Configurable extension for the generated Unity Build files (c, cpp, cc etc.)
- Easy to exclude certain problematic sources from the Unity Build file generation
- Source file count limit for the Unity Build files
- Working out-of-source builds
- Qt support (handling moc files correctly)
- Track the source file changes with md5 hashes
- Regenerate the Unity Build files only when really needed

Limitation:
- The Unity Build files are not removed with "make clean".

Note:
- The UNITY_GENERATE_MOC() macro is optional. I wrote this lightweight moc file generation for Qt instead of the default slower moc invocations in CMake.

Basic usage of my script:

- Copy UnityBuild.cmake to your project's root directory.
- Include in the root CMakeLists.txt:

INCLUDE(UnityBuild.cmake)

- In any source directory, use like this:

SET(LIBEXAMPLE_SRC
    first.c
    second.c
    third.c
    )

# Parameters:
# 1. Name prefix for the generated Unity Build files
# 2. CMake variable which contains the source files
# 3. Unit size (source file count) per Unity Build file
# 4. Extension for the Unity Build files to invoke the correct compiler
ENABLE_UNITY_BUILD(libexample LIBEXAMPLE_SRC 10 c)

# Optional: add any problematic source files which are not suitable for Unity Build
# and which you are too lazy to fix.
SET(LIBEXAMPLE_SRC ${LIBEXAMPLE_SRC} lazy.c)

ADD_LIBRARY(libexample STATIC ${LIBEXAMPLE_SRC})

Thursday, August 13, 2015

Shishi odoshi vs. rabbit

So, a rabbit thought today that it could devour our proud bush in the garden. Then the shishi odoshi set it straight.