Archive for January, 2010

China Map Deviation as a Regression Problem


All published maps of China are deviated. GPS devices sold in China are modified to give the same deviated coordinates. If you don’t know about this, you may read here, here, here or here. Fortunately, the same deviation algorithm is applied to all the maps I have seen, including Garmin (UniStrong in China) GPS maps and Google Maps/Earth. The algorithm is secret and is accessible only to the authorities and to companies such as Garmin and Google. Needless to say, this is very annoying for GPS users. Many individuals have tried to discover the deviation algorithm through GPS measurement and correlation, and found it to be not only nonlinear but also very complicated to describe.

I accidentally found that the Chinese version of Google Maps can correlate the satellite image with the map, which gives the amount of deviation for any location in China. This URL queries the deviation of 34.29273N,108.94695E (Xi’an):,0.001&t=h&z=18&vp=$34.29273,108.94695 (it seems it doesn’t work now)

With enough sample data, we should be able to get a regression function that resembles the deviation algorithm. I’m not good at regression, so I’m putting up all my data in the hope that someone can help out. It can be downloaded from here. The format is very simple: four fields (longitude, latitude, longitude deviation and latitude deviation) separated by tabs. Longitude deviation means (deviated_longitude – true_longitude), and latitude deviation is defined the same way. The points are sampled with 0.025 degree separation, i.e. 40 samples per degree. There are 1529737 points (lines of text) in the file. Only points in mainland China are available. Figures 1 and 2 show an overview of the data.
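For anyone who wants to start playing with the file, here is a minimal sketch of a loader for the four-field, tab-separated format described above. The function names and the two sample rows are made up for illustration (they are not taken from the real data set), and the sketch assumes the file has no header line.

```python
import io
import numpy as np

def load_deviation_table(source):
    """Read rows of: longitude, latitude, longitude deviation, latitude deviation.

    `source` may be a file name or an open text file. Assumes tab-separated
    fields and no header line, as described in the post.
    """
    data = np.loadtxt(source, delimiter="\t", ndmin=2)
    if data.shape[1] != 4:
        raise ValueError("expected 4 tab-separated fields per line")
    lon, lat, dlon, dlat = data.T
    return lon, lat, dlon, dlat

def true_from_deviated(dev_lon, dev_lat, dlon, dlat):
    """Invert the definition deviation = deviated - true."""
    return dev_lon - dlon, dev_lat - dlat

# Two made-up rows, 0.025 degree apart, only to show the format.
sample = io.StringIO(
    "100.000\t30.000\t0.0050\t-0.0010\n"
    "100.025\t30.000\t0.0060\t0.0020\n"
)
lon, lat, dlon, dlat = load_deviation_table(sample)
print("points:", lon.size)
print("longitude deviation range:", dlon.min(), dlon.max())
print("latitude deviation range:", dlat.min(), dlat.max())
```

With the real file, passing its path instead of the `StringIO` object should work the same way, although loading 1.5 million lines with `np.loadtxt` may take a while.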

There is another file, which contains samples from 8 selected lines (4 west-east, 4 south-north). The sample resolution is higher (200 samples per degree). It is used to plot Figures 3-6. I think it’s helpful for regression analysis.

Here are the plots of the data:

Fig. 1. latitude deviation shown in color

Fig. 2. longitude deviation shown in color

Fig. 3. longitude deviation vs. longitude

Fig. 4. latitude deviation vs. longitude

Fig. 5. longitude deviation vs. latitude

Fig. 6. latitude deviation vs. latitude

Some observations:

  1. The longitude deviation is always positive (it deviates to the east). The maximum is 0.0085562 degrees.
  2. The latitude deviation ranges from -0.0038542 degrees (to the south) to +0.0028230 degrees (to the north).
  3. It’s very obvious that there are sinusoidal components of period 1 and 1/3 (see Fig. 3, 4 and 6).
  4. Fig. 4 looks simple. You may think it’s f(x)=b*sin(a*x) + b*sin(3*a*x) + c*x + d. You would be wrong: there are other small components.
  5. To make discussion easier, let’s define fdx(x,y) to be the longitude deviation of a point with longitude x and latitude y. Similarly, let fdy(x,y) be the latitude deviation of that point. So, Figure 3 shows fdx(x,25.12), fdx(x,32.24), … Figure 6 shows fdy(85.52,x), fdy(97.84,x)
  6. I suggest using a Fourier transform, but I’m not good at it.
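To make observations 3, 4 and 6 a bit more concrete, here is a rough sketch of a Fourier-guided fit along a single sampled line: remove the linear trend, use an FFT to pick the strongest periods, then do a linear least-squares fit of sine and cosine terms at those periods plus c*x + d. The function names and the test signal are made up for illustration, the samples are assumed to be uniformly spaced, and this is certainly not the real deviation algorithm. Whatever is left in the residual is where the "other small components" would show up.

```python
import numpy as np

def dominant_periods(x, y, count=2):
    """Return the `count` strongest periods of y(x), strongest first.

    Assumes x is uniformly spaced. Peak picking is naive: it takes the
    largest FFT bins, so neighbouring bins of one broad peak may both
    be selected on real data.
    """
    step = x[1] - x[0]
    # Remove the linear trend so it does not swamp the low frequencies.
    slope, intercept = np.polyfit(x, y, 1)
    detrended = y - (slope * x + intercept)
    spectrum = np.abs(np.fft.rfft(detrended))
    freqs = np.fft.rfftfreq(len(x), d=step)
    # Skip bin 0 (the constant term) when ranking the peaks.
    strongest = np.argsort(spectrum[1:])[::-1][:count] + 1
    return 1.0 / freqs[strongest]

def fit_sinusoids(x, y, periods):
    """Least-squares fit of y ~ c*x + d + sum_k (a_k*sin + b_k*cos)(2*pi*x/p_k).

    Returns the coefficients [c, d, a_1, b_1, a_2, b_2, ...] and the residual.
    """
    columns = [x, np.ones_like(x)]
    for period in periods:
        omega = 2.0 * np.pi / period
        columns += [np.sin(omega * x), np.cos(omega * x)]
    design = np.column_stack(columns)
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residual = y - design @ coef
    return coef, residual

# Made-up test signal at 200 samples per degree over 6 degrees.
x = np.arange(1200) * 0.005
y = (0.002 * np.sin(2 * np.pi * x) + 0.001 * np.sin(6 * np.pi * x)
     + 0.0001 * x + 0.005)

periods = dominant_periods(x, y, count=2)
coef, residual = fit_sinusoids(x, y, periods)
print("periods found:", periods)
print("largest residual:", np.abs(residual).max())
```

On the real line samples, running `dominant_periods` again on the residual is one way to hunt for the smaller components, one or two at a time.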

Happy regression!