名称

地理编码 — 接收一个地址字符串(或其他标准化的地址),并输出一组可能的位置,其中包括 NAD 83 经纬度的点几何图形、每个位置的标准化地址以及评分。评分越低,匹配的可能性越高。结果按评分从低到高排序。可以选择传入最大结果数(默认为 10)和 restrict_region(默认为 NULL)。

概要

setof record geocode(varchar address, integer max_results=10, geometry restrict_region=NULL, norm_addy OUT addy, geometry OUT geomout, integer OUT rating);

setof record geocode(norm_addy in_addy, integer max_results=10, geometry restrict_region=NULL, norm_addy OUT addy, geometry OUT geomout, integer OUT rating);

描述

接收一个地址字符串(或已经标准化的地址),并输出一组可能的位置,其中包括 NAD 83 经纬度的点几何图形、每个位置的 normalized_address (addy) 以及评分。评分越低,匹配的可能性越高。结果按评分从低到高排序。使用 Tiger 数据(edges,faces,addr)、PostgreSQL 模糊字符串匹配(soundex,levenshtein)和 PostGIS 线插值函数来沿 Tiger 边插值地址。评分越高,地理编码的准确性越低。地理编码点默认从中心线偏移 10 米到街道地址所在的一侧(左/右)。

增强功能:2.0.0 版本支持 Tiger 2010 结构化数据,并修订了一些逻辑以提高地理编码的速度和准确性,并将点从中心线偏移到街道地址所在的一侧。新的参数 max_results 可用于指定最佳结果的数量或仅返回最佳结果。

示例:基本

以下示例的计时是在一台配备 2GB 内存的 3.0 GHz 单处理器 Windows 7 机器上进行的,该机器运行 PostgreSQL 9.1rc1/PostGIS 2.0,加载了 MA、MN、CA 和 RI 州的所有 Tiger 数据。

完全匹配的计算速度更快 (61ms)

SELECT g.rating, ST_X(g.geomout) As lon, ST_Y(g.geomout) As lat,
    (addy).address As stno, (addy).streetname As street,
    (addy).streettypeabbrev As styp, (addy).location As city, (addy).stateabbrev As st,(addy).zip
    FROM geocode('75 State Street, Boston MA 02109', 1) As g;
 rating |        lon        |      lat       | stno | street | styp |  city  | st |  zip
--------+-------------------+----------------+------+--------+------+--------+----+-------
      0 | -71.0557505845646 | 42.35897920691 |   75 | State  | St   | Boston | MA | 02109

即使没有传入邮政编码,地理编码器也可以猜测(大约需要 122-150 毫秒)

SELECT g.rating, ST_AsText(ST_SnapToGrid(g.geomout,0.00001)) As wktlonlat,
    (addy).address As stno, (addy).streetname As street,
    (addy).streettypeabbrev As styp, (addy).location As city, (addy).stateabbrev As st,(addy).zip
    FROM geocode('226 Hanover Street, Boston, MA',1) As g;
 rating |         wktlonlat         | stno | street  | styp |  city  | st |  zip
--------+---------------------------+------+---------+------+--------+----+-------
      1 | POINT(-71.05528 42.36316) |  226 | Hanover | St   | Boston | MA | 02113

可以处理拼写错误,并提供多个带有评分的可能解决方案,耗时较长(500 毫秒)。

SELECT g.rating, ST_AsText(ST_SnapToGrid(g.geomout,0.00001)) As wktlonlat,
    (addy).address As stno, (addy).streetname As street,
    (addy).streettypeabbrev As styp, (addy).location As city, (addy).stateabbrev As st,(addy).zip
    FROM geocode('31 - 37 Stewart Street, Boston, MA 02116',1) As g;
 rating |         wktlonlat         | stno | street | styp |  city  | st |  zip
--------+---------------------------+------+--------+------+--------+----+-------
     70 | POINT(-71.06466 42.35114) |   31 | Stuart | St   | Boston | MA | 02116
    

用于批量地理编码地址。最简单的方法是设置 max_results=1。仅处理那些尚未地理编码的地址(没有评分)。

CREATE TABLE addresses_to_geocode(addid serial PRIMARY KEY, address text,
        lon numeric, lat numeric, new_address text, rating integer);

INSERT INTO addresses_to_geocode(address)
VALUES ('529 Main Street, Boston MA, 02129'),
 ('77 Massachusetts Avenue, Cambridge, MA 02139'),
 ('25 Wizard of Oz, Walaford, KS 99912323'),
 ('26 Capen Street, Medford, MA'),
 ('124 Mount Auburn St, Cambridge, Massachusetts 02138'),
 ('950 Main Street, Worcester, MA 01610');

-- only update the first 3 addresses (323-704 ms -  there are caching and shared memory effects so first geocode you do is always slower) --
-- for large numbers of addresses you don't want to update all at once
-- since the whole geocode must commit at once
-- For this example we rejoin with LEFT JOIN
-- and set to rating to -1 rating if no match
-- to ensure we don't regeocode a bad address
UPDATE addresses_to_geocode
  SET  (rating, new_address, lon, lat)
    = ( COALESCE(g.rating,-1), pprint_addy(g.addy),
       ST_X(g.geomout)::numeric(8,5), ST_Y(g.geomout)::numeric(8,5) )
FROM (SELECT addid, address
    FROM addresses_to_geocode
    WHERE rating IS NULL ORDER BY addid LIMIT 3) As a
    LEFT JOIN LATERAL geocode(a.address,1) As g ON true
WHERE a.addid = addresses_to_geocode.addid;

result
-----
Query returned successfully: 3 rows affected, 480 ms execution time.

SELECT * FROM addresses_to_geocode WHERE rating is not null;

 addid |                   address                    |    lon    |   lat    |                new_address                | rating
-------+----------------------------------------------+-----------+----------+-------------------------------------------+--------
     1 | 529 Main Street, Boston MA, 02129            | -71.07177 | 42.38357 | 529 Main St, Boston, MA 02129             |      0
     2 | 77 Massachusetts Avenue, Cambridge, MA 02139 | -71.09396 | 42.35961 | 77 Massachusetts Ave, Cambridge, MA 02139 |      0
     3 | 25 Wizard of Oz, Walaford, KS 99912323       | -97.92913 | 38.12717 | Willowbrook, KS 67502                     |    108
(3 rows)

示例:使用几何过滤器

SELECT g.rating, ST_AsText(ST_SnapToGrid(g.geomout,0.00001)) As wktlonlat,
    (addy).address As stno, (addy).streetname As street,
    (addy).streettypeabbrev As styp,
    (addy).location As city, (addy).stateabbrev As st,(addy).zip
  FROM geocode('100 Federal Street, MA',
        3,
        (SELECT ST_Union(the_geom)
            FROM place WHERE statefp = '25' AND name = 'Lynn')::geometry
        ) As g;

 rating |         wktlonlat         | stno | street  | styp | city | st |  zip
--------+---------------------------+------+---------+------+------+----+-------
      7 | POINT(-70.96796 42.4659)  |  100 | Federal | St   | Lynn | MA | 01905
     16 | POINT(-70.96786 42.46853) | NULL | Federal | St   | Lynn | MA | 01905
(2 rows)

Time: 622.939 ms