| -rw-r--r-- | README.md | 79 |
| -rw-r--r-- | error.json | 345 |
| -rw-r--r-- | ne_10m_admin_0_countries.zip | bin | 0 -> 5012760 bytes |
| -rw-r--r-- | ne_10m_admin_0_countries/ne_10m_admin_0_countries.cpg | 1 |
| -rw-r--r-- | ne_10m_admin_0_countries/ne_10m_admin_0_countries.dbf | bin | 0 -> 8744936 bytes |
| -rw-r--r-- | ne_10m_admin_0_countries/ne_10m_admin_0_countries.prj | 1 |
| -rw-r--r-- | ne_10m_admin_0_countries/ne_10m_admin_0_countries.qmd | 26 |
| -rw-r--r-- | ne_10m_admin_0_countries/ne_10m_admin_0_countries.shp | bin | 0 -> 8806180 bytes |
| -rw-r--r-- | ne_10m_admin_0_countries/ne_10m_admin_0_countries.shx | bin | 0 -> 2164 bytes |
9 files changed, 438 insertions, 14 deletions
@@ -6,15 +6,16 @@ working on. I was attempting to import [the
 world](https://www.naturalearthdata.com/downloads/10m-cultural-vectors/)
 in Elastic, but Elastic has some bug where you can't upload GeoJSON
 through the web form, so I had to do it manually, like this:
+
 ```bash
-NAME,ECONOMY,FORMAL_EN,GDP_MD,ISO_A2
-ogr2ogr -f ElasticSearch  -progress \
--select $fields \
--lco NOT_ANALYZED_FIELDS=$fields \
--lco INDEX_NAME=countries \
--lco OVERWRITE_INDEX=YES \
-ES:http://localhost:9200 \
-/vsizip/./ne_10m_admin_0_countries.zip/ne_10m_admin_0_countries.shp
+$ fields=NAME,ECONOMY,FORMAL_EN,GDP_MD,ISO_A2
+$ ogr2ogr -f ElasticSearch  -progress \
+    -select $fields \
+    -lco NOT_ANALYZED_FIELDS=$fields \
+    -lco INDEX_NAME=countries \
+    -lco OVERWRITE_INDEX=YES \
+    ES:http://localhost:9200 \
+    /vsizip/./ne_10m_admin_0_countries.zip/ne_10m_admin_0_countries.shp
 ```
 
 But ogr2ogr yells at you after processing about 170 countries or so. If
@@ -39,15 +40,65 @@ Fortunately, QGIS has a Geometry Checker Plugin, but unfortunately, it's
 a bit complicated and was a pain to do. If you don't tune it right, you
 end up having to sort through lots of "mistakes" which aren't mistakes.
 
-Also there's an [unfixed bug](https://github.com/qgis/QGIS/issues/37527)
-in QGIS which doesn't make shape files correctly.
-Don't know how that's possible considering that's
-literally what the software's made for, but I could only get geojson
-input to work correctly.
-
 For anyone else who might be down this rabbit hole, Egypt is Object ID
 161--I promise that will save you time. Or you could just download my
 copy of the file here.
 
 Hoping to use this git repo as part of a bug report, once I read their
 process on that.
+
+Included here is [ESRI shape
+file](https://www.loc.gov/preservation/digital/formats/fdd/fdd000280.shtml)
+in the `ne_10m_admin_0_countries` directory as well as the same output
+in GeoJSON, since I think the format is a bit easier to work with.
+
+
+## Steps to reproduce the bug
+
+1. Download the original file from Natural Earth
+
+```bash
+wget https://www.naturalearthdata.com/http//www.naturalearthdata.com/download/10m/cultural/ne_10m_admin_0_countries.zip
+```
+
+2. Try to import the file into Elastic with the series of bash
+   commands given earlier. Or alternatively, just:
+
+```bash
+$ ogr2ogr -f ElasticSearch  -progress \
+    -lco NOT_ANALYZED_FIELDS={ALL} \
+    -lco INDEX_NAME=countries \
+    -lco OVERWRITE_INDEX=YES \
+    ES:http://localhost:9200 \
+    /vsizip/./ne_10m_admin_0_countries.zip/ne_10m_admin_0_countries.shp
+```
+
+3. Observe you receive a similar error as given in `error.json`
+
+As a sanity check, you can re-run the same command without the fancy zip
+syntax by manually unzipping:
+
+```bash
+$ mkdir -p ne && unzip ne_10m_admin_0_countries.zip -d ne/
+$ ogr2ogr -f ElasticSearch  -progress \
+    -lco NOT_ANALYZED_FIELDS={ALL} \
+    -lco INDEX_NAME=countries \
+    -lco OVERWRITE_INDEX=YES \
+    ES:http://localhost:9200 \
+    ne/ne_10m_admin_0_countries.shp
+```
+
+You will get the same error
+
+### Notes
+
+Oddly enough, converting to other formats *will not* yield the same
+error. I suspect there is some check that's not done by the GeoJSON
+(and other) drivers that the Elastic one does.
+
+``` bash
+$ ogr2ogr -progress -f GeoJSON test.geojson /vsizip/./ne_10m_admin_0_countries.zip/ne_10m_admin_0_countries.shp
+```
+
+^That runs just fine
diff --git a/error.json b/error.json
new file mode 100644
index 0000000..4f2496c
--- /dev/null
+++ b/error.json
@@ -0,0 +1,345 @@
+{
+  "took": 839,
+  "errors": true,
+  "items": [
+    {
+      "index": {
+        "_index": "countries",
+        "_type": "_doc",
+        "_id": "F9aIWH8BuPJM6EPks_66",
+        "_version": 1,
+        "result": "created",
+        "_shards": {
+          "total": 2,
+          "successful": 1,
+          "failed": 0
+        },
+        "_seq_no": 156,
+        "_primary_term": 1,
+        "status": 201
+      }
+    },
+    {
+      "index": {
+        "_index": "countries",
+        "_type": "_doc",
+        "_id": "GNaIWH8BuPJM6EPks_66",
+        "_version": 1,
+        "result": "created",
+        "_shards": {
+          "total": 2,
+          "successful": 1,
+          "failed": 0
+        },
+        "_seq_no": 157,
+        "_primary_term": 1,
+        "status": 201
+      }
+    },
+    {
+      "index": {
+        "_index": "countries",
+        "_type": "_doc",
+        "_id": "GdaIWH8BuPJM6EPks_66",
+        "_version": 1,
+        "result": "created",
+        "_shards": {
+          "total": 2,
+          "successful": 1,
+          "failed": 0
+        },
+        "_seq_no": 158,
+        "_primary_term": 1,
+        "status": 201
+      }
+    },
+    {
+      "index": {
+        "_index": "countries",
+        "_type": "_doc",
+        "_id": "GtaIWH8BuPJM6EPks_66",
+        "_version": 1,
+        "result": "created",
+        "_shards": {
+          "total": 2,
+          "successful": 1,
+          "failed": 0
+        },
+        "_seq_no": 159,
+        "_primary_term": 1,
+        "status": 201
+      }
+    },
+    {
+      "index": {
+        "_index": "countries",
+        "_type": "_doc",
+        "_id": "G9aIWH8BuPJM6EPks_66",
+        "_version": 1,
+        "result": "created",
+        "_shards": {
+          "total": 2,
+          "successful": 1,
+          "failed": 0
+        },
+        "_seq_no": 160,
+        "_primary_term": 1,
+        "status": 201
+      }
+    },
+    {
+      "index": {
+        "_index": "countries",
+        "_type": "_doc",
+        "_id": "HNaIWH8BuPJM6EPks_66",
+        "status": 400,
+        "error": {
+          "type": "mapper_parsing_exception",
+          "reason": "failed to parse field [geometry] of type [geo_shape]",
+          "caused_by": {
+            "type": "illegal_argument_exception",
+            "reason": "Self-intersection at or near point [35.621087106,23.139292914]"
+          }
+        }
+      }
+    },
+    {
+      "index": {
+        "_index": "countries",
+        "_type": "_doc",
+        "_id": "HdaIWH8BuPJM6EPks_66",
+        "_version": 1,
+        "result": "created",
+        "_shards": {
+          "total": 2,
+          "successful": 1,
+          "failed": 0
+        },
+        "_seq_no": 161,
+        "_primary_term": 1,
+        "status": 201
+      }
+    },
+    {
+      "index": {
+        "_index": "countries",
+        "_type": "_doc",
+        "_id": "HtaIWH8BuPJM6EPks_66",
+        "_version": 1,
+        "result": "created",
+        "_shards": {
+          "total": 2,
+          "successful": 1,
+          "failed": 0
+        },
+        "_seq_no": 162,
+        "_primary_term": 1,
+        "status": 201
+      }
+    },
+    {
+      "index": {
+        "_index": "countries",
+        "_type": "_doc",
+        "_id": "H9aIWH8BuPJM6EPks_66",
+        "_version": 1,
+        "result": "created",
+        "_shards": {
+          "total": 2,
+          "successful": 1,
+          "failed": 0
+        },
+        "_seq_no": 163,
+        "_primary_term": 1,
+        "status": 201
+      }
+    },
+    {
+      "index": {
+        "_index": "countries",
+        "_type": "_doc",
+        "_id": "INaIWH8BuPJM6EPks_66",
+        "_version": 1,
+        "result": "created",
+        "_shards": {
+          "total": 2,
+          "successful": 1,
+          "failed": 0
+        },
+        "_seq_no": 164,
+        "_primary_term": 1,
+        "status": 201
+      }
+    },
+    {
+      "index": {
+        "_index": "countries",
+        "_type": "_doc",
+        "_id": "IdaIWH8BuPJM6EPks_66",
+        "_version": 1,
+        "result": "created",
+        "_shards": {
+          "total": 2,
+          "successful": 1,
+          "failed": 0
+        },
+        "_seq_no": 165,
+        "_primary_term": 1,
+        "status": 201
+      }
+    },
+    {
+      "index": {
+        "_index": "countries",
+        "_type": "_doc",
+        "_id": "ItaIWH8BuPJM6EPks_66",
+        "_version": 1,
+        "result": "created",
+        "_shards": {
+          "total": 2,
+          "successful": 1,
+          "failed": 0
+        },
+        "_seq_no": 166,
+        "_primary_term": 1,
+        "status": 201
+      }
+    },
+    {
+      "index": {
+        "_index": "countries",
+        "_type": "_doc",
+        "_id": "I9aIWH8BuPJM6EPks_66",
+        "_version": 1,
+        "result": "created",
+        "_shards": {
+          "total": 2,
+          "successful": 1,
+          "failed": 0
+        },
+        "_seq_no": 167,
+        "_primary_term": 1,
+        "status": 201
+      }
+    },
+    {
+      "index": {
+        "_index": "countries",
+        "_type": "_doc",
+        "_id": "JNaIWH8BuPJM6EPks_66",
+        "_version": 1,
+        "result": "created",
+        "_shards": {
+          "total": 2,
+          "successful": 1,
+          "failed": 0
+        },
+        "_seq_no": 168,
+        "_primary_term": 1,
+        "status": 201
+      }
+    },
+    {
+      "index": {
+        "_index": "countries",
+        "_type": "_doc",
+        "_id": "JdaIWH8BuPJM6EPks_66",
+        "_version": 1,
+        "result": "created",
+        "_shards": {
+          "total": 2,
+          "successful": 1,
+          "failed": 0
+        },
+        "_seq_no": 169,
+        "_primary_term": 1,
+        "status": 201
+      }
+    },
+    {
+      "index": {
+        "_index": "countries",
+        "_type": "_doc",
+        "_id": "JtaIWH8BuPJM6EPks_66",
+        "_version": 1,
+        "result": "created",
+        "_shards": {
+          "total": 2,
+          "successful": 1,
+          "failed": 0
+        },
+        "_seq_no": 170,
+        "_primary_term": 1,
+        "status": 201
+      }
+    },
+    {
+      "index": {
+        "_index": "countries",
+        "_type": "_doc",
+        "_id": "J9aIWH8BuPJM6EPks_66",
+        "_version": 1,
+        "result": "created",
+        "_shards": {
+          "total": 2,
+          "successful": 1,
+          "failed": 0
+        },
+        "_seq_no": 171,
+        "_primary_term": 1,
+        "status": 201
+      }
+    },
+    {
+      "index": {
+        "_index": "countries",
+        "_type": "_doc",
+        "_id": "KNaIWH8BuPJM6EPks_66",
+        "_version": 1,
+        "result": "created",
+        "_shards": {
+          "total": 2,
+          "successful": 1,
+          "failed": 0
+        },
+        "_seq_no": 172,
+        "_primary_term": 1,
+        "status": 201
+      }
+    },
+    {
+      "index": {
+        "_index": "countries",
+        "_type": "_doc",
+        "_id": "KdaIWH8BuPJM6EPks_66",
+        "_version": 1,
+        "result": "created",
+        "_shards": {
+          "total": 2,
+          "successful": 1,
+          "failed": 0
+        },
+        "_seq_no": 173,
+        "_primary_term": 1,
+        "status": 201
+      }
+    },
+    {
+      "index": {
+        "_index": "countries",
+        "_type": "_doc",
+        "_id": "KtaIWH8BuPJM6EPks_66",
+        "_version": 1,
+        "result": "created",
+        "_shards": {
+          "total": 2,
+          "successful": 1,
+          "failed": 0
+        },
+        "_seq_no": 174,
+        "_primary_term": 1,
+        "status": 201
+      }
+    }
+  ]
+}
diff --git a/ne_10m_admin_0_countries.zip b/ne_10m_admin_0_countries.zip
Binary files differ
new file mode 100644
index 0000000..39433bb
--- /dev/null
+++ b/ne_10m_admin_0_countries.zip
diff --git a/ne_10m_admin_0_countries/ne_10m_admin_0_countries.cpg b/ne_10m_admin_0_countries/ne_10m_admin_0_countries.cpg
new file mode 100644
index 0000000..3ad133c
--- /dev/null
+++ b/ne_10m_admin_0_countries/ne_10m_admin_0_countries.cpg
@@ -0,0 +1 @@
+UTF-8
\ No newline at end of file
diff --git a/ne_10m_admin_0_countries/ne_10m_admin_0_countries.dbf b/ne_10m_admin_0_countries/ne_10m_admin_0_countries.dbf
Binary files differ
new file mode 100644
index 0000000..3d734c9
--- /dev/null
+++ b/ne_10m_admin_0_countries/ne_10m_admin_0_countries.dbf
diff --git a/ne_10m_admin_0_countries/ne_10m_admin_0_countries.prj b/ne_10m_admin_0_countries/ne_10m_admin_0_countries.prj
new file mode 100644
index 0000000..f45cbad
--- /dev/null
+++ b/ne_10m_admin_0_countries/ne_10m_admin_0_countries.prj
@@ -0,0 +1 @@
+GEOGCS["GCS_WGS_1984",DATUM["D_WGS_1984",SPHEROID["WGS_1984",6378137.0,298.257223563]],PRIMEM["Greenwich",0.0],UNIT["Degree",0.0174532925199433]]
\ No newline at end of file
diff --git a/ne_10m_admin_0_countries/ne_10m_admin_0_countries.qmd b/ne_10m_admin_0_countries/ne_10m_admin_0_countries.qmd
new file mode 100644
index 0000000..e6f3dba
--- /dev/null
+++ b/ne_10m_admin_0_countries/ne_10m_admin_0_countries.qmd
@@ -0,0 +1,26 @@
+<!DOCTYPE qgis PUBLIC 'http://mrcc.com/qgis.dtd' 'SYSTEM'>
+<qgis version="3.22.4-Białowieża">
+  <identifier></identifier>
+  <parentidentifier></parentidentifier>
+  <language></language>
+  <type>dataset</type>
+  <title></title>
+  <abstract></abstract>
+  <links/>
+  <fees></fees>
+  <encoding></encoding>
+  <crs>
+    <spatialrefsys>
+      <wkt></wkt>
+      <proj4></proj4>
+      <srsid>0</srsid>
+      <srid>0</srid>
+      <authid></authid>
+      <description></description>
+      <projectionacronym></projectionacronym>
+      <ellipsoidacronym></ellipsoidacronym>
+      <geographicflag>false</geographicflag>
+    </spatialrefsys>
+  </crs>
+  <extent/>
+</qgis>
diff --git a/ne_10m_admin_0_countries/ne_10m_admin_0_countries.shp b/ne_10m_admin_0_countries/ne_10m_admin_0_countries.shp
Binary files differ
new file mode 100644
index 0000000..28b350d
--- /dev/null
+++ b/ne_10m_admin_0_countries/ne_10m_admin_0_countries.shp
diff --git a/ne_10m_admin_0_countries/ne_10m_admin_0_countries.shx b/ne_10m_admin_0_countries/ne_10m_admin_0_countries.shx
Binary files differ
new file mode 100644
index 0000000..f04365d
--- /dev/null
+++ b/ne_10m_admin_0_countries/ne_10m_admin_0_countries.shx
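The `error.json` added in this commit is a bulk response in which one item out of twenty carries an `error` object, so it is easy to miss by eye. As a rough sketch only (not part of the commit), the Python below shows one way to pull the failing items out of a response shaped like that file. The function name `failed_items` and the trimmed `SAMPLE` are illustrative assumptions; the two sample items reuse `_id` values and the error text from `error.json`, with the successful item shortened.

```python
import json


def failed_items(bulk_response):
    """Collect (position, _id, reason) for each bulk item that carries an error."""
    failures = []
    for position, item in enumerate(bulk_response.get("items", [])):
        # Every item is a one-key object named after its action ("index" here).
        for result in item.values():
            error = result.get("error")
            if error is None:
                continue
            # Prefer the nested cause when present; it names the geometry problem.
            cause = error.get("caused_by", {})
            reason = cause.get("reason", error.get("reason"))
            failures.append((position, result.get("_id"), reason))
    return failures


# Hypothetical trimmed sample: two items taken from error.json, the first shortened.
SAMPLE = json.loads("""
{
  "took": 839,
  "errors": true,
  "items": [
    {"index": {"_index": "countries", "_id": "G9aIWH8BuPJM6EPks_66", "status": 201}},
    {"index": {"_index": "countries", "_id": "HNaIWH8BuPJM6EPks_66", "status": 400,
               "error": {"type": "mapper_parsing_exception",
                         "reason": "failed to parse field [geometry] of type [geo_shape]",
                         "caused_by": {"type": "illegal_argument_exception",
                                       "reason": "Self-intersection at or near point [35.621087106,23.139292914]"}}}}
  ]
}
""")

for position, doc_id, reason in failed_items(SAMPLE):
    print(f"item {position} ({doc_id}): {reason}")
```

To run it against the real file, load `error.json` with `json.load` and pass the result to `failed_items` in place of `SAMPLE`. The reported position is the index within that one bulk response, which is not necessarily the object ID of the feature in the shapefile.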
