-rw-r--r-- | README.md | 79
-rw-r--r-- | error.json | 345
-rw-r--r-- | ne_10m_admin_0_countries.zip | bin | 0 -> 5012760 bytes
-rw-r--r-- | ne_10m_admin_0_countries/ne_10m_admin_0_countries.cpg | 1
-rw-r--r-- | ne_10m_admin_0_countries/ne_10m_admin_0_countries.dbf | bin | 0 -> 8744936 bytes
-rw-r--r-- | ne_10m_admin_0_countries/ne_10m_admin_0_countries.prj | 1
-rw-r--r-- | ne_10m_admin_0_countries/ne_10m_admin_0_countries.qmd | 26
-rw-r--r-- | ne_10m_admin_0_countries/ne_10m_admin_0_countries.shp | bin | 0 -> 8806180 bytes
-rw-r--r-- | ne_10m_admin_0_countries/ne_10m_admin_0_countries.shx | bin | 0 -> 2164 bytes
9 files changed, 438 insertions, 14 deletions
@@ -6,15 +6,16 @@ working on. I was attempting to import [the
 world](https://www.naturalearthdata.com/downloads/10m-cultural-vectors/)
 in Elastic, but Elastic has some bug where you can't upload GeoJSON
 through the web form, so I had to do it manually, like this:
+
 ```bash
-NAME,ECONOMY,FORMAL_EN,GDP_MD,ISO_A2
-ogr2ogr -f ElasticSearch -progress \
--select $fields \
--lco NOT_ANALYZED_FIELDS=$fields \
--lco INDEX_NAME=countries \
--lco OVERWRITE_INDEX=YES \
-ES:http://localhost:9200 \
-/vsizip/./ne_10m_admin_0_countries.zip/ne_10m_admin_0_countries.shp
+$ fields=NAME,ECONOMY,FORMAL_EN,GDP_MD,ISO_A2
+$ ogr2ogr -f ElasticSearch -progress \
+    -select $fields \
+    -lco NOT_ANALYZED_FIELDS=$fields \
+    -lco INDEX_NAME=countries \
+    -lco OVERWRITE_INDEX=YES \
+    ES:http://localhost:9200 \
+    /vsizip/./ne_10m_admin_0_countries.zip/ne_10m_admin_0_countries.shp
 ```
 
 But ogr2ogr yells at you after processing about 170 countries or so. If
@@ -39,15 +40,65 @@
 Fortunately, QGIS has a Geometry Checker Plugin, but unfortunately, it's
 a bit complicated and was a pain to do. If you don't tune it right, you
 end up having to sort through lots of "mistakes" which aren't mistakes.
-Also there's an [unfixed bug](https://github.com/qgis/QGIS/issues/37527)
-in QGIS which doesn't make shape files correctly.
-Don't know how that's possible considering that's
-literally what the software's made for, but I could only get geojson
-input to work correctly.
-
 For anyone else who might be down this rabbit hole, Egypt is Object ID
 161--I promise that will save you time. Or you could just download my
 copy of the file here.
 
 Hoping to use this git repo as part of a bug report, once I read their
 process on that.
+
+Included here is [ESRI shape
+file](https://www.loc.gov/preservation/digital/formats/fdd/fdd000280.shtml)
+in the `ne_10m_admin_0_countries` directory as well as the same output
+in GeoJSON, since I think the format is a bit easier to work with.
+
+
+## Steps to reproduce the bug
+
+1. Download the original file from Natural Earth
+
+```bash
+wget https://www.naturalearthdata.com/http//www.naturalearthdata.com/download/10m/cultural/ne_10m_admin_0_countries.zip
+```
+
+2. Try to import the file into Elastic with the series of bash
+   commands given earlier. Or alternatively, just:
+
+```bash
+$ ogr2ogr -f ElasticSearch -progress \
+    -lco NOT_ANALYZED_FIELDS={ALL} \
+    -lco INDEX_NAME=countries \
+    -lco OVERWRITE_INDEX=YES \
+    ES:http://localhost:9200 \
+    /vsizip/./ne_10m_admin_0_countries.zip/ne_10m_admin_0_countries.shp
+```
+
+3. Observe you receive a similar error as given in `error.json`
+
+As a sanity check, you can re-run the same command without the fancy zip
+syntax by manually unzipping:
+
+```bash
+$ mkdir -p ne && unzip ne_10m_admin_0_countries.zip -d ne/
+$ ogr2ogr -f ElasticSearch -progress \
+    -lco NOT_ANALYZED_FIELDS={ALL} \
+    -lco INDEX_NAME=countries \
+    -lco OVERWRITE_INDEX=YES \
+    ES:http://localhost:9200 \
+    ne/ne_10m_admin_0_countries.shp
+```
+
+You will get the same error
+
+### Notes
+
+Oddly enough, converting to other formats *will not* yield the same
+error. I suspect there is some check that's not done by the GeoJSON
+(and other) drivers that the Elastic one does.
+
+
+``` bash
+$ ogr2ogr -progress -f GeoJSON test.geojson /vsizip/./ne_10m_admin_0_countries.zip/ne_10m_admin_0_countries.shp
+```
+
+^That runs just fine
diff --git a/error.json b/error.json
new file mode 100644
index 0000000..4f2496c
--- /dev/null
+++ b/error.json
@@ -0,0 +1,345 @@
+{
+  "took": 839,
+  "errors": true,
+  "items": [
+    {
+      "index": {
+        "_index": "countries",
+        "_type": "_doc",
+        "_id": "F9aIWH8BuPJM6EPks_66",
+        "_version": 1,
+        "result": "created",
+        "_shards": {
+          "total": 2,
+          "successful": 1,
+          "failed": 0
+        },
+        "_seq_no": 156,
+        "_primary_term": 1,
+        "status": 201
+      }
+    },
+    {
+      "index": {
+        "_index": "countries",
+        "_type": "_doc",
+        "_id": "GNaIWH8BuPJM6EPks_66",
+        "_version": 1,
+        "result": "created",
+        "_shards": {
+          "total": 2,
+          "successful": 1,
+          "failed": 0
+        },
+        "_seq_no": 157,
+        "_primary_term": 1,
+        "status": 201
+      }
+    },
+    {
+      "index": {
+        "_index": "countries",
+        "_type": "_doc",
+        "_id": "GdaIWH8BuPJM6EPks_66",
+        "_version": 1,
+        "result": "created",
+        "_shards": {
+          "total": 2,
+          "successful": 1,
+          "failed": 0
+        },
+        "_seq_no": 158,
+        "_primary_term": 1,
+        "status": 201
+      }
+    },
+    {
+      "index": {
+        "_index": "countries",
+        "_type": "_doc",
+        "_id": "GtaIWH8BuPJM6EPks_66",
+        "_version": 1,
+        "result": "created",
+        "_shards": {
+          "total": 2,
+          "successful": 1,
+          "failed": 0
+        },
+        "_seq_no": 159,
+        "_primary_term": 1,
+        "status": 201
+      }
+    },
+    {
+      "index": {
+        "_index": "countries",
+        "_type": "_doc",
+        "_id": "G9aIWH8BuPJM6EPks_66",
+        "_version": 1,
+        "result": "created",
+        "_shards": {
+          "total": 2,
+          "successful": 1,
+          "failed": 0
+        },
+        "_seq_no": 160,
+        "_primary_term": 1,
+        "status": 201
+      }
+    },
+    {
+      "index": {
+        "_index": "countries",
+        "_type": "_doc",
+        "_id": "HNaIWH8BuPJM6EPks_66",
+        "status": 400,
+        "error": {
+          "type": "mapper_parsing_exception",
+          "reason": "failed to parse field [geometry] of type [geo_shape]",
+          "caused_by": {
+            "type": "illegal_argument_exception",
+            "reason": "Self-intersection at or near point [35.621087106,23.139292914]"
+          }
+        }
+      }
+    },
+    {
+      "index": {
+        "_index": "countries",
+        "_type": "_doc",
+        "_id": "HdaIWH8BuPJM6EPks_66",
+        "_version": 1,
+        "result": "created",
+        "_shards": {
+          "total": 2,
+          "successful": 1,
+          "failed": 0
+        },
+        "_seq_no": 161,
+        "_primary_term": 1,
+        "status": 201
+      }
+    },
+    {
+      "index": {
+        "_index": "countries",
+        "_type": "_doc",
+        "_id": "HtaIWH8BuPJM6EPks_66",
+        "_version": 1,
+        "result": "created",
+        "_shards": {
+          "total": 2,
+          "successful": 1,
+          "failed": 0
+        },
+        "_seq_no": 162,
+        "_primary_term": 1,
+        "status": 201
+      }
+    },
+    {
+      "index": {
+        "_index": "countries",
+        "_type": "_doc",
+        "_id": "H9aIWH8BuPJM6EPks_66",
+        "_version": 1,
+        "result": "created",
+        "_shards": {
+          "total": 2,
+          "successful": 1,
+          "failed": 0
+        },
+        "_seq_no": 163,
+        "_primary_term": 1,
+        "status": 201
+      }
+    },
+    {
+      "index": {
+        "_index": "countries",
+        "_type": "_doc",
+        "_id": "INaIWH8BuPJM6EPks_66",
+        "_version": 1,
+        "result": "created",
+        "_shards": {
+          "total": 2,
+          "successful": 1,
+          "failed": 0
+        },
+        "_seq_no": 164,
+        "_primary_term": 1,
+        "status": 201
+      }
+    },
+    {
+      "index": {
+        "_index": "countries",
+        "_type": "_doc",
+        "_id": "IdaIWH8BuPJM6EPks_66",
+        "_version": 1,
+        "result": "created",
+        "_shards": {
+          "total": 2,
+          "successful": 1,
+          "failed": 0
+        },
+        "_seq_no": 165,
+        "_primary_term": 1,
+        "status": 201
+      }
+    },
+    {
+      "index": {
+        "_index": "countries",
+        "_type": "_doc",
+        "_id": "ItaIWH8BuPJM6EPks_66",
+        "_version": 1,
+        "result": "created",
+        "_shards": {
+          "total": 2,
+          "successful": 1,
+          "failed": 0
+        },
+        "_seq_no": 166,
+        "_primary_term": 1,
+        "status": 201
+      }
+    },
+    {
+      "index": {
+        "_index": "countries",
+        "_type": "_doc",
+        "_id": "I9aIWH8BuPJM6EPks_66",
+        "_version": 1,
+        "result": "created",
+        "_shards": {
+          "total": 2,
+          "successful": 1,
+          "failed": 0
+        },
+        "_seq_no": 167,
+        "_primary_term": 1,
+        "status": 201
+      }
+    },
+    {
+      "index": {
+        "_index": "countries",
+        "_type": "_doc",
+        "_id": "JNaIWH8BuPJM6EPks_66",
+        "_version": 1,
+        "result": "created",
+        "_shards": {
+          "total": 2,
+          "successful": 1,
+          "failed": 0
+        },
+        "_seq_no": 168,
+        "_primary_term": 1,
+        "status": 201
+      }
+    },
+    {
+      "index": {
+        "_index": "countries",
+        "_type": "_doc",
+        "_id": "JdaIWH8BuPJM6EPks_66",
+        "_version": 1,
+        "result": "created",
+        "_shards": {
+          "total": 2,
+          "successful": 1,
+          "failed": 0
+        },
+        "_seq_no": 169,
+        "_primary_term": 1,
+        "status": 201
+      }
+    },
+    {
+      "index": {
+        "_index": "countries",
+        "_type": "_doc",
+        "_id": "JtaIWH8BuPJM6EPks_66",
+        "_version": 1,
+        "result": "created",
+        "_shards": {
+          "total": 2,
+          "successful": 1,
+          "failed": 0
+        },
+        "_seq_no": 170,
+        "_primary_term": 1,
+        "status": 201
+      }
+    },
+    {
+      "index": {
+        "_index": "countries",
+        "_type": "_doc",
+        "_id": "J9aIWH8BuPJM6EPks_66",
+        "_version": 1,
+        "result": "created",
+        "_shards": {
+          "total": 2,
+          "successful": 1,
+          "failed": 0
+        },
+        "_seq_no": 171,
+        "_primary_term": 1,
+        "status": 201
+      }
+    },
+    {
+      "index": {
+        "_index": "countries",
+        "_type": "_doc",
+        "_id": "KNaIWH8BuPJM6EPks_66",
+        "_version": 1,
+        "result": "created",
+        "_shards": {
+          "total": 2,
+          "successful": 1,
+          "failed": 0
+        },
+        "_seq_no": 172,
+        "_primary_term": 1,
+        "status": 201
+      }
+    },
+    {
+      "index": {
+        "_index": "countries",
+        "_type": "_doc",
+        "_id": "KdaIWH8BuPJM6EPks_66",
+        "_version": 1,
+        "result": "created",
+        "_shards": {
+          "total": 2,
+          "successful": 1,
+          "failed": 0
+        },
+        "_seq_no": 173,
+        "_primary_term": 1,
+        "status": 201
+      }
+    },
+    {
+      "index": {
+        "_index": "countries",
+        "_type": "_doc",
+        "_id": "KtaIWH8BuPJM6EPks_66",
+        "_version": 1,
+        "result": "created",
+        "_shards": {
+          "total": 2,
+          "successful": 1,
+          "failed": 0
+        },
+        "_seq_no": 174,
+        "_primary_term": 1,
+        "status": 201
+      }
+    }
+  ]
+}
diff --git a/ne_10m_admin_0_countries.zip b/ne_10m_admin_0_countries.zip
Binary files differ
new file mode 100644
index 0000000..39433bb
--- /dev/null
+++ b/ne_10m_admin_0_countries.zip
diff --git a/ne_10m_admin_0_countries/ne_10m_admin_0_countries.cpg b/ne_10m_admin_0_countries/ne_10m_admin_0_countries.cpg
new file mode 100644
index 0000000..3ad133c
--- /dev/null
+++ b/ne_10m_admin_0_countries/ne_10m_admin_0_countries.cpg
@@ -0,0 +1 @@
+UTF-8
\ No newline at end of file
diff --git a/ne_10m_admin_0_countries/ne_10m_admin_0_countries.dbf b/ne_10m_admin_0_countries/ne_10m_admin_0_countries.dbf
Binary files differ
new file mode 100644
index 0000000..3d734c9
--- /dev/null
+++ b/ne_10m_admin_0_countries/ne_10m_admin_0_countries.dbf
diff --git a/ne_10m_admin_0_countries/ne_10m_admin_0_countries.prj b/ne_10m_admin_0_countries/ne_10m_admin_0_countries.prj
new file mode 100644
index 0000000..f45cbad
--- /dev/null
+++ b/ne_10m_admin_0_countries/ne_10m_admin_0_countries.prj
@@ -0,0 +1 @@
+GEOGCS["GCS_WGS_1984",DATUM["D_WGS_1984",SPHEROID["WGS_1984",6378137.0,298.257223563]],PRIMEM["Greenwich",0.0],UNIT["Degree",0.0174532925199433]]
\ No newline at end of file
diff --git a/ne_10m_admin_0_countries/ne_10m_admin_0_countries.qmd b/ne_10m_admin_0_countries/ne_10m_admin_0_countries.qmd
new file mode 100644
index 0000000..e6f3dba
--- /dev/null
+++ b/ne_10m_admin_0_countries/ne_10m_admin_0_countries.qmd
@@ -0,0 +1,26 @@
+<!DOCTYPE qgis PUBLIC 'http://mrcc.com/qgis.dtd' 'SYSTEM'>
+<qgis version="3.22.4-Białowieża">
+  <identifier></identifier>
+  <parentidentifier></parentidentifier>
+  <language></language>
+  <type>dataset</type>
+  <title></title>
+  <abstract></abstract>
+  <links/>
+  <fees></fees>
+  <encoding></encoding>
+  <crs>
+    <spatialrefsys>
+      <wkt></wkt>
+      <proj4></proj4>
+      <srsid>0</srsid>
+      <srid>0</srid>
+      <authid></authid>
+      <description></description>
+      <projectionacronym></projectionacronym>
+      <ellipsoidacronym></ellipsoidacronym>
+      <geographicflag>false</geographicflag>
+    </spatialrefsys>
+  </crs>
+  <extent/>
+</qgis>
diff --git a/ne_10m_admin_0_countries/ne_10m_admin_0_countries.shp b/ne_10m_admin_0_countries/ne_10m_admin_0_countries.shp
Binary files differ
new file mode 100644
index 0000000..28b350d
--- /dev/null
+++ b/ne_10m_admin_0_countries/ne_10m_admin_0_countries.shp
diff --git a/ne_10m_admin_0_countries/ne_10m_admin_0_countries.shx b/ne_10m_admin_0_countries/ne_10m_admin_0_countries.shx
Binary files differ
new file mode 100644
index 0000000..f04365d
--- /dev/null
+++ b/ne_10m_admin_0_countries/ne_10m_admin_0_countries.shx
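
Reviewer note, not part of this commit: the bulk response in `error.json` buries its single failure among many `201` entries, so a small filter can help when checking a re-run. The following is a minimal sketch, assuming the response keeps the shape shown in `error.json`. The names `failed_items` and `sample` are hypothetical, and `sample` is a trimmed stand-in for the real file, not its full contents.

```python
import json


def failed_items(bulk_response):
    """Return (position, doc_id, reason) for each item that was not indexed."""
    failures = []
    for position, item in enumerate(bulk_response.get("items", [])):
        # Each item holds one action result, keyed by the action name ("index" here).
        for result in item.values():
            error = result.get("error")
            if error is None:
                continue
            # Prefer the nested cause, which names the offending coordinate.
            cause = error.get("caused_by", {})
            reason = cause.get("reason") or error.get("reason", "")
            failures.append((position, result.get("_id"), reason))
    return failures


# Trimmed stand-in for error.json: one created document, one rejected one.
sample = json.loads("""
{
  "took": 839,
  "errors": true,
  "items": [
    {"index": {"_index": "countries", "_id": "G9aIWH8BuPJM6EPks_66", "status": 201}},
    {"index": {"_index": "countries", "_id": "HNaIWH8BuPJM6EPks_66", "status": 400,
               "error": {"type": "mapper_parsing_exception",
                         "reason": "failed to parse field [geometry] of type [geo_shape]",
                         "caused_by": {"type": "illegal_argument_exception",
                                       "reason": "Self-intersection at or near point [35.621087106,23.139292914]"}}}}
  ]
}
""")

for position, doc_id, reason in failed_items(sample):
    print(f"item {position} ({doc_id}): {reason}")
```

To run it against the real file, replace `sample` with the result of `json.load` on an open handle to `error.json`. The position is only the index within that one bulk response, so it does not map directly to a feature ID in the shapefile.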