What is Software Architecture?

Software architect looks like a prestigious job, but to me it is no different from being an engineer with a deeper understanding of the problem domain and its limitations.

As an architect, I look at my job from different dimensions. This ability is important; without it you will fall into the trap of over-engineering or under-design (which may well be the case in my daily life).

1. Business Dimension: Understand the problem

The key difference between a software architect and a general programmer is not simply technical competence; it is the degree of understanding of the business domain. Without that understanding, an architect cannot make reasonable trade-offs among different solutions. It covers flexibility, configuration, performance and load concerns.

2. Structural Dimension: How is the code organized and grouped?

The main difference between a good program and a bad program is whether its structure truly reflects the current and upcoming business use cases. I suggest starting small and trying to explain the function of each module in “1 simple sentence”. You can refactor the project later on if your business grows.

3. Tier Dimension: Classical MVC? RESTful => Spring => ESB => DB?

Once we have defined the business use cases, we can design the data flow. Usually data or requests flow through different layers, each with its own concerns. MVC is a classic three-part layering in which C and V are external facing while M holds the business objects, so we can focus on the responsibilities of each tier. There are other layered architectures as well, for example Events => Kafka => Cassandra for a data warehouse. We need to broaden our horizons and understand how others solve similar problems.
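
To make the split of responsibilities concrete, here is a minimal sketch in Python. The order example, the class names and the price list are all made up for illustration; this is not tied to any particular framework.

```python
from dataclasses import dataclass


@dataclass
class Order:
    """Model: a business object that owns the business rule."""
    item: str
    quantity: int

    def total(self, unit_price: float) -> float:
        if self.quantity <= 0:
            raise ValueError("quantity must be positive")
        return self.quantity * unit_price


def render_order(order: Order, total: float) -> str:
    """View: turns a model into text for the outside world."""
    return f"{order.quantity} x {order.item} = {total:.2f}"


class OrderController:
    """Controller: accepts an external request and coordinates model and view."""

    def __init__(self, price_list: dict):
        self.price_list = price_list  # hypothetical price lookup

    def handle(self, request: dict) -> str:
        order = Order(item=request["item"], quantity=int(request["quantity"]))
        total = order.total(self.price_list[order.item])
        return render_order(order, total)


controller = OrderController({"book": 12.5})
response = controller.handle({"item": "book", "quantity": "2"})
print(response)  # 2 x book = 25.00
```

The controller and the view never contain the pricing rule, so the rule can change without touching the external-facing code.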

4. Library Dimension: What we can choose, and why

We should not reinvent the wheel. When we want to implement something, we should first Google to see whether there is an available library that serves the purpose. Using other people’s libraries saves time and, most importantly, keeps some design flaws away from your system. Of course, an experienced architect can quickly sniff out whether a library fits the team's needs; that is the key value you add to your team.

Install Superset on Ubuntu 16.04 with venv

sudo apt-get install build-essential libssl-dev libffi-dev python-dev python-pip libsasl2-dev libldap2-dev python3-dev

python3 -m venv superset-venv

source superset-venv/bin/activate

pip install --upgrade setuptools pip

pip install superset

# The following steps are copied from https://superset.incubator.apache.org/installation.html

# Create an admin user (you will be prompted to set username, first and last name before setting a password)
fabmanager create-admin --app superset

# Initialize the database
superset db upgrade

# Load some data to play with
superset load_examples

# Create default roles and permissions
superset init

# Start the web server on port 8088, use -p to bind to another port
superset runserver

Ubuntu KVM virtualization with GPU Passthrough

Linux ships with KVM, a hypervisor comparable to VMware and VirtualBox. Its great strength is GPU passthrough, which gives the guest system native access to the GPU.

The feature is great, but the setup is a bit complicated because it involves low-level Linux configuration to hide the GPU from the host kernel's drivers, which allows the guest to use it exclusively.

To do so, follow these steps.

1. Install KVM as follows

sudo apt-get install qemu-kvm libvirt-bin virtinst bridge-utils cpu-checker virt-manager ovmf

2. Rebuild the initramfs for the kernel so that pci_stub is loaded, and claims the GPU, before the radeon or amdgpu driver loads.

jimmy@jimmy-home:~$ cat /etc/modules
# /etc/modules: kernel modules to load at boot time.
# This file contains the names of kernel modules that should be loaded
# at boot time, one per line. Lines beginning with "#" are ignored.

jimmy@jimmy-home:~$ cat /etc/initramfs-tools/modules 
# List of modules that you want to include in your initramfs.
# They will be loaded at boot time in the order below.
# Syntax:  module_name [args ...]
# You must run update-initramfs(8) to effect this change.
# Examples:
# raid1
# sd_mod
pci_stub ids=1002:683d,1002:aab0
jimmy@jimmy-home:~$ sudo update-initramfs -u

3. Configure the kernel to load vfio-pci before any GPU driver, and bind vfio-pci to the GPU hardware IDs, which you can look up with “lspci -nnk”. Afterwards, verify the status with “lspci -nnk” and make sure the driver in use for the GPU is vfio-pci rather than radeon.

jimmy@jimmy-home:~$ vi /etc/modprobe.d/vfio.conf 
softdep radeon pre: vfio-pci
softdep amdgpu pre: vfio-pci
options vfio-pci ids=1002:683d,1002:aab0 disable_vga=1

jimmy@jimmy-home:~$ lspci -nnk
01:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Cape Verde XT [Radeon HD 7770/8760 / R7 250X] [1002:683d]
	Subsystem: PC Partner Limited / Sapphire Technology Cape Verde XT [Radeon HD 7770/8760 / R7 250X] [174b:e244]
	Kernel driver in use: vfio-pci
	Kernel modules: radeon, amdgpu
01:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Cape Verde/Pitcairn HDMI Audio [Radeon HD 7700/7800 Series] [1002:aab0]
	Subsystem: PC Partner Limited / Sapphire Technology Cape Verde/Pitcairn HDMI Audio [Radeon HD 7700/7800 Series] [174b:aab0]
	Kernel driver in use: vfio-pci
	Kernel modules: snd_hda_intel
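
The verification above can also be scripted. The following is a rough sketch that parses "lspci -nnk" style text and reports which driver each device is bound to. The function name and the trimmed sample text are my own; the sample only mimics the output shown above.

```python
def drivers_in_use(lspci_output: str) -> dict:
    """Map each PCI address to its 'Kernel driver in use' value."""
    drivers = {}
    current = None
    for line in lspci_output.splitlines():
        if not line.strip():
            continue
        if not line[0].isspace():
            # Device header lines start at column 0 with the PCI address.
            current = line.split()[0]
        elif current and "Kernel driver in use:" in line:
            drivers[current] = line.split(":", 1)[1].strip()
    return drivers


# Trimmed sample that mimics the lspci -nnk output shown above.
SAMPLE = (
    "01:00.0 VGA compatible controller [0300]: Cape Verde XT [1002:683d]\n"
    "\tKernel driver in use: vfio-pci\n"
    "\tKernel modules: radeon, amdgpu\n"
    "01:00.1 Audio device [0403]: Cape Verde/Pitcairn HDMI Audio [1002:aab0]\n"
    "\tKernel driver in use: vfio-pci\n"
    "\tKernel modules: snd_hda_intel\n"
)

result = drivers_in_use(SAMPLE)
all_bound = all(driver == "vfio-pci" for driver in result.values())
print(result, all_bound)
```

On a real machine the input could come from subprocess.run(["lspci", "-nnk"], capture_output=True, text=True).stdout instead of the sample string.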

4. Running libvirtd guests as root is not good practice, but it is a quick way to let them access attachable storage.

Edit /etc/libvirt/qemu.conf to make this work:
uncomment the user and group settings so that guests run as root,
then restart libvirtd.



Nokia’s study

Nokia is an interesting company to study, especially when you look at it from an engineering perspective. It had one of the best engineering teams, which led to Nokia's success in the 1990s, but it failed to adapt to the new era and declined.

I think the decline happened because the company could no longer blend tech and business well, given the explosive growth of the business.

(What follows is a bit technical, but it supports my view of the conflict between tech and business.)

Nokia’s management invested heavily in R&D, and one of the key research areas was embedded Linux, a separate stream outside Symbian OS. It had the concept of an app, but apps were hard to code. An app was written in C++ and used Qt as the UI library. That was a natural choice because this pair has long been used on Linux. The hurdle in writing apps was the exceptionally low computational power (<100 MHz). It is exactly the same for today's programmers: learning Java is easy, learning GOOD Java is hard.

Furthermore, it seems Nokia's business side did not want to promote the embedded Linux concept. Everyone has heard of the common models, such as the 8810, 8850, N73, N95 and 5800, but it seems hardly anyone has seen a Nokia Communicator or E90.

Later on, the industry came up with J2ME, which partially solved the hardware management issue but was still very hard to code for.

Having the app concept without a complete end-to-end use case means no one will use apps. Finding, buying, downloading and installing an app took more than two hours. You cannot expect your grandma to do it on her own. That’s why I think the App Store killed Nokia.

It simply lost the first mover advantage.

After that, Nokia had two chances to turn around, or at least to capture a decent portion of the market. The first was launching the Nokia 5800 XpressMusic in 2008, and the second was choosing Windows rather than Android.

The Nokia 5800 was considered a game changer: it had a touch screen, ran Symbian S60 v5, supported 3G, and was built around music as an answer to iTunes at that time. Its key failure was the resistive display (in contrast to the capacitive display of the first iPhone), which limited it to stylus use and ruled out multi-touch gestures. The product was only a superficial clone of iPhone features, made without understanding the key market value. It simply failed to blend tech and design and to maximize its own advantages. At the same time, competitors started adopting Android on top-tier phones, leaving Nokia behind.

The second chance was terminating Symbian OS and choosing Microsoft's Windows instead of Android. I am not talking about the Stephen Elop Trojan Horse (smile) case, only that the management failed to do a proper SWOT analysis. Android would have been the better choice given Nokia's strong embedded-systems engineering team, whose skills transfer to Android. Nokia had also been using ARM chips since the 5800. Choosing Windows was essentially asking a Java developer to write a C# program from scratch; it simply gave up all the know-how.

All in all, the success of Apple is not simply about UI/UX but about the complete supply chain for apps. Nokia's failure was caused by a failure to understand the market.

PS. I have been looking at the Nokia 8 recently. Even though it is not from the original Nokia, its minimally customized Android and metal body may make it a good choice for me.


SQL and the traditional RDBMS are inevitably key survival skills for every programmer. SQL is valuable because it is more or less a universal standard for interacting with any database. An RDBMS is valuable because it provides ACID guarantees, which drastically simplify a programmer's life, especially in a web environment.

However, an RDBMS suffers from scalability and concurrency problems at web scale. The common 1 master + N slaves setup or data sharding only postpones the problems by raising the ceiling a few times over. These techniques also impose limitations: replication only helps when reads far outnumber writes, and the partition key for sharding has to be selected carefully, so designers have to be aware of these limits.
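
To show why the partition key matters, here is a small sketch of hash-based shard routing in Python. The function names, the four-shard setup and the sample keys are assumptions for illustration only; real sharding layers differ in detail.

```python
import hashlib


def shard_for(partition_key: str, shard_count: int) -> int:
    """Route a row to a shard by hashing its partition key."""
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def distribution(keys, shard_count):
    """Count how many rows land on each shard."""
    counts = [0] * shard_count
    for key in keys:
        counts[shard_for(key, shard_count)] += 1
    return counts


# A high-cardinality key (a user id) spreads rows across all shards.
user_keys = [f"user-{i}" for i in range(1000)]
even = distribution(user_keys, 4)

# A low-cardinality key (a country code) piles rows onto very few shards.
country_keys = ["HK"] * 900 + ["US"] * 100
skewed = distribution(country_keys, 4)

print("by user id:", even)
print("by country:", skewed)
```

With the country key, at least 900 of the 1000 rows end up on a single shard, so that shard becomes the bottleneck no matter how many shards are added.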

Some people have started moving to NoSQL; whether that means NO SQL or NOT ONLY SQL is debatable. From an engineering point of view, however, NoSQL is a solution to specific problems, not a silver bullet.

In general, NoSQL can be further divided into the following categories.
– Graph Database – Neo4j
– Document-based Database – MongoDB
– Key Value Database – Redis / Memcache
– Column Oriented Database – Cassandra
– Time Series Database (Extension) – Riak DB / OpenTSDB

Each of them addresses a particular business model or use case. For example, a graph DB handles parties and relationships very efficiently, and a document-based DB handles hierarchical data better than an RDBMS. We usually gain scalability by giving up transaction capability.

The famous CAP theorem (Brewer’s theorem) states that Consistency, Availability and Partition Tolerance cannot all be guaranteed at the same time; we can only pick two out of three. Every NoSQL DB and RDBMS follows this rule with NO EXCEPTION.

SQL is still widely used today, since much software is not really web scale, or the management does not know about or want to pay the cost of web scale. So can we have SQL and RDBMS capabilities together with NoSQL scalability? That is the goal of NewSQL. Google (The God, again) has published a paper on Spanner, which is now in production on Google Cloud Platform.


In short, it tries to detach the TX manager from the underlying storage managers, so that for each query or update the TX manager can acquire the relevant storage managers. Since storage is handled locally by each storage manager, the storage unit itself is usually bigger in size (64MB) and can be distributed. It fits perfectly with AWS S3 or Google Cloud Storage. However, as it involves network operations, the overall performance should be slower than a local DB with a smaller dataset.

There are other implementations that can be run locally, like VoltDB or CockroachDB.

Machine Learning Notes

Everyone has been talking about Machine Learning, Artificial Intelligence, Big Data and Data Analysis since AlphaGo launched. It is one of the hot topics recently, but it seems not many people really know how to use it.

In general, classical AI just solves the following problems, nothing more. It looks more like a statistics problem than an algorithm problem.

1. Supervised Learning
a. Classification
b. Regression

2. Unsupervised Learning
a. Clustering

Supervised learning means we know the results for the training data; the algorithm helps us predict a result statistically for unseen data.

Unsupervised learning means we are digging gold from garbage: we don’t know the expected result. A classical example: I am given a list of personal information and asked to group the people into 4 categories, where the number 4 is chosen by the user of the algorithm.
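
The grouping example can be sketched with a tiny k-means in plain Python. The two numeric features per person are made up, and the farthest-point seeding is a simplification chosen to keep the sketch deterministic; libraries such as scikit-learn use more robust initialisation.

```python
import math


def kmeans(points, k, iterations=10):
    """Group points into k clusters using Lloyd's algorithm."""
    # Farthest-point seeding: an assumption that keeps this sketch deterministic.
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        )

    labels = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest center.
        labels = [
            min(range(k), key=lambda i: math.dist(p, centers[i])) for p in points
        ]
        # Update step: each center moves to the mean of its members.
        for i in range(k):
            members = [p for p, label in zip(points, labels) if label == i]
            if members:
                centers[i] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels, centers


# Hypothetical people described by two numeric features each.
people = [
    (1, 1), (1, 2), (2, 1),
    (10, 1), (10, 2), (11, 1),
    (1, 10), (2, 10), (1, 11),
    (10, 10), (11, 10), (10, 11),
]
labels, centers = kmeans(people, k=4)
print(labels)
```

Nobody told the algorithm what the groups mean; it only received the number 4 and found the four clumps on its own.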

Classification and Regression are the two key uses of supervised learning. We are answering the question “Given your previous experience (Training Data), what is the expected value for the unseen input?” Classification is essentially the discrete form of regression, although some algorithm variants, such as classification trees, only produce discrete outputs.

scikit-learn, a Python library commonly used for Data Analysis and Machine Learning, publishes a mind map (its algorithm cheat sheet) that helps us determine which algorithm to use.

After understanding what question we are solving, the next question is obviously HOW.

Most machine learning libraries have a “pipeline” concept: a standardized sequence of steps for training a machine learning model. However, it is no different from the programming “Input”-“Process”-“Output” model. In other words, there is no magic wand.

Data scientists are free to select the algorithm for each step, and linking all these steps together produces a machine learning model.
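
Here is a minimal sketch of the pipeline idea in plain Python. The class names, the two steps (a min-max scaler and a 1-nearest-neighbour classifier) and the height/weight data are my own choices for illustration; scikit-learn offers a Pipeline class along the same lines.

```python
class MinMaxScaler:
    """Step 1 (process): rescale each feature to the range [0, 1]."""

    def fit(self, rows):
        columns = list(zip(*rows))
        self.low = [min(column) for column in columns]
        self.high = [max(column) for column in columns]
        return self

    def transform(self, rows):
        return [
            [(v - lo) / (hi - lo) if hi > lo else 0.0
             for v, lo, hi in zip(row, self.low, self.high)]
            for row in rows
        ]


class NearestNeighbour:
    """Step 2 (output): predict the label of the closest training row."""

    def fit(self, rows, labels):
        self.rows, self.labels = rows, labels
        return self

    def predict(self, rows):
        def closest(row):
            distances = [
                sum((a - b) ** 2 for a, b in zip(row, known)) for known in self.rows
            ]
            return self.labels[distances.index(min(distances))]
        return [closest(row) for row in rows]


class Pipeline:
    """Chains transformers and a final predictor into one model."""

    def __init__(self, transformers, predictor):
        self.transformers = transformers
        self.predictor = predictor

    def fit(self, rows, labels):
        for step in self.transformers:
            rows = step.fit(rows).transform(rows)
        self.predictor.fit(rows, labels)
        return self

    def predict(self, rows):
        for step in self.transformers:
            rows = step.transform(rows)
        return self.predictor.predict(rows)


# Hypothetical training data: [height_cm, weight_kg] -> label
train_rows = [[150, 45], [155, 50], [185, 90], [190, 95]]
train_labels = ["small", "small", "large", "large"]

model = Pipeline([MinMaxScaler()], NearestNeighbour()).fit(train_rows, train_labels)
predictions = model.predict([[152, 48], [188, 92]])
print(predictions)  # ['small', 'large']
```

Swapping the scaler or the classifier for another algorithm does not change the Pipeline class, which is the whole point of the concept.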

Linux Software RAID 1

First, install the multi-disk administration tool (mdadm)

sudo apt-get install initramfs-tools mdadm 

Next, create a partition on each disk and set its partition type to “fd Linux raid auto”

sudo fdisk /dev/sdb
sudo fdisk /dev/sdc

Create the MD device (md0)

sudo mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdb1 /dev/sdc1

Format the array with ext4

sudo mkfs.ext4 /dev/md0

Commands to check the RAID status

sudo mdadm --query /dev/md0
sudo mdadm --detail /dev/md0
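
If you want to check the status from a script, the "key : value" lines printed by mdadm --detail are easy to parse. The sketch below is only illustrative: the function names and the health rule are my own, and the sample text is a shortened, hand-written imitation of the real output.

```python
def parse_mdadm_detail(text: str) -> dict:
    """Collect the 'key : value' lines of mdadm --detail into a dict."""
    fields = {}
    for line in text.splitlines():
        key, separator, value = line.partition(" : ")
        if separator:
            fields[key.strip()] = value.strip()
    return fields


def is_healthy(fields: dict) -> bool:
    """Simple rule of thumb: not degraded, no failed devices, all devices active."""
    return (
        "degraded" not in fields.get("State", "degraded")
        and fields.get("Failed Devices") == "0"
        and fields.get("Active Devices") == fields.get("Raid Devices")
    )


# Shortened, hand-written imitation of the mdadm --detail output.
SAMPLE = """\
/dev/md0:
        Version : 1.2
     Raid Level : raid1
   Raid Devices : 2
  Total Devices : 2
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
"""

DEGRADED = SAMPLE.replace("State : clean", "State : clean, degraded").replace(
    "Active Devices : 2", "Active Devices : 1"
)

healthy = is_healthy(parse_mdadm_detail(SAMPLE))
degraded = is_healthy(parse_mdadm_detail(DEGRADED))
print(healthy, degraded)  # True False
```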

Nginx Reverse Proxy

I have been playing around with JHipster recently; one of its goals is to support a microservices architecture.

Under that architecture, a reverse proxy is inevitable. HAProxy is one option, but I tried Nginx as the reverse proxy. Here is the config I use.

# You should look at the following URL's in order to grasp a solid understanding
# of Nginx configuration files in order to fully unleash the power of Nginx.
# http://wiki.nginx.org/Pitfalls
# http://wiki.nginx.org/QuickStart
# http://wiki.nginx.org/Configuration
# Generally, you will want to move this file somewhere, and start with a clean
# file but keep this around for reference. Or just disable in sites-enabled.
# Please see /usr/share/doc/nginx-doc/examples/ for more detailed examples.

#### JHipster SPECIFIC ROUTE ####
upstream jhipster {
	server localhost:8080 weight=10 max_fails=3 fail_timeout=30s;
	server localhost:18080 weight=10 max_fails=3 fail_timeout=30s;
	server localhost:28080 weight=10 max_fails=3 fail_timeout=30s;
}
#### JHipster SPECIFIC ROUTE ####

# Default server configuration
server {
	listen 80 default_server;
	listen [::]:80 default_server;

	# SSL configuration
	# listen 443 ssl default_server;
	# listen [::]:443 ssl default_server;
	# Note: You should disable gzip for SSL traffic.
	# See: https://bugs.debian.org/773332
	# Read up on ssl_ciphers to ensure a secure configuration.
	# See: https://bugs.debian.org/765782
	# Self signed certs generated by the ssl-cert package
	# Don't use them in a production server!
	# include snippets/snakeoil.conf;

	root /var/www/html;

	# Add index.php to the list if you are using PHP
	index index.html index.htm index.nginx-debian.html;

	server_name _;

#	location / {
#		# First attempt to serve request as file, then
#		# as directory, then fall back to displaying a 404.
#		try_files $uri $uri/ =404;
#	}

	location / {
		proxy_pass http://jhipster;
		proxy_http_version 1.1;
		proxy_set_header Host $host;
		proxy_set_header Upgrade $http_upgrade;
		proxy_set_header Connection 'upgrade';
		proxy_set_header X-Real-IP $remote_addr;
		proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
		proxy_set_header X-Forwarded-Proto $scheme;
		proxy_cache_bypass $http_upgrade;
	}

	# pass the PHP scripts to FastCGI server listening on
	#location ~ \.php$ {
	#	include snippets/fastcgi-php.conf;
	#	# With php7.0-cgi alone:
	#	fastcgi_pass;
	#	# With php7.0-fpm:
	#	fastcgi_pass unix:/run/php/php7.0-fpm.sock;
	#}

	# deny access to .htaccess files, if Apache's document root
	# concurs with nginx's one
	#location ~ /\.ht {
	#	deny all;
	#}
}

# Virtual Host configuration for example.com
# You can move that to a different file under sites-available/ and symlink that
# to sites-enabled/ to enable it.
#server {
#	listen 80;
#	listen [::]:80;
#	server_name example.com;
#	root /var/www/example.com;
#	index index.html;
#	location / {
#		try_files $uri $uri/ =404;
#	}
#}
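
The upstream block in the config above spreads requests over three backends with equal weights and skips a backend after max_fails failures. The Python sketch below imitates that behaviour in a simplified form; the class names are made up, and it ignores weights and fail_timeout, so it is not how Nginx is actually implemented.

```python
import itertools


class Backend:
    """One upstream server with a simple failure counter."""

    def __init__(self, address, max_fails=3):
        self.address = address
        self.max_fails = max_fails
        self.fails = 0

    @property
    def available(self):
        return self.fails < self.max_fails


class RoundRobin:
    """Hands out backends in turn, skipping the unavailable ones."""

    def __init__(self, backends):
        self.backends = backends
        self._cycle = itertools.cycle(backends)

    def pick(self):
        for _ in range(len(self.backends)):
            backend = next(self._cycle)
            if backend.available:
                return backend
        raise RuntimeError("no backend available")


pool = RoundRobin([
    Backend("localhost:8080"),
    Backend("localhost:18080"),
    Backend("localhost:28080"),
])
first_round = [pool.pick().address for _ in range(3)]

# Simulate three failed requests on the second backend.
pool.backends[1].fails = 3
second_round = [pool.pick().address for _ in range(4)]

print(first_round)
print(second_round)
```

After the second backend reaches its failure limit, traffic alternates between the two remaining servers, which is the behaviour you would expect to observe from the proxy.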

SailsJS Hello World

Sails JS is a full MVC framework built on Express and Socket.io. It looks easy to use, and I have tried to incorporate different functionalities into it.



I have included the following in the project.

1. Passport.js
2. Swaggers.js
3. React + Webpack
4. Sequelize

I have also configured a few settings in the Sails JS Project.

2. Winston

Notes for WebSphere and DB2 on Linux


WebSphere has to be installed with IBM Installation Manager.


After starting it, it will ask for a repository; you may need one of the following repositories

Version 9.0: https://www.ibm.com/software/repositorymanager/V9WASILAN
Version 8.5: https://www.ibm.com/software/repositorymanager/V85WASDeveloperILAN
Version 8.0: https://www.ibm.com/software/repositorymanager/V8WASDeveloperILAN

Make sure you install the optional JDK_1.7.0 or JDK_1.7.1, because WebSphere uses a patched JDK.


Creating Database

db2 create database oms;
db2 activate database oms;
db2 connect to oms;
db2 connect to oms user db2inst1 using password

Drop Database

In some cases there may still be connections to the existing database, so it is easier to restart the DB before dropping it

db2stop force
db2start
db2 drop database oms

Backup Database (schema DDL only, via db2look)

db2look -d oms -z db2inst1 -e -o oms.ddl

Restore Database (replay the DDL; -t makes db2 treat ";" as the statement terminator)

db2 -tvmf oms.ddl

Extract a single table to a delimited (DEL) file; use “of ixf” instead for IXF (Integration Exchange Format)

db2 "export to aa_user.del of del SELECT * FROM AA_USER"

Import Data from IXF

db2 load from /tmp/department.ixf of ixf replace into department2

Create Tablespace with larger page size

db2 create bufferpool bp32k pagesize 32K
db2 create tablespace data32k pagesize 32K bufferpool bp32k

Set rev_info for Hibernate Envers

db2 alter table rev_info alter column rev set generated always as identity

Export CLOB or BLOB
db2 "export to TBL_XXX.ixf of ixf lobs to ./ SELECT * FROM TBL_XXX"