From 81936e1e2cfbbbf903f7a461a2df61ec3bee127d Mon Sep 17 00:00:00 2001
From: mjfernez
Date: Sat, 5 Mar 2022 00:42:18 -0500
Subject: Add bug details, full shapefile, and nonsense

This commit adds more details about the bug to the README, including
steps to reproduce. I also realized the fixed shapefile from QGIS was
exporting just fine all along, but I was zipping it into a directory and
therefore was just entering the path wrong. So obviously... ogr didn't
recognize that. There is no issue with QGIS
---
 README.md | 79 ++++++++++++++++++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 65 insertions(+), 14 deletions(-)

(limited to 'README.md')

diff --git a/README.md b/README.md
index edbf42d..90446eb 100644
--- a/README.md
+++ b/README.md
@@ -6,15 +6,16 @@ working on. I was attempting to import
 [the world](https://www.naturalearthdata.com/downloads/10m-cultural-vectors/)
 in Elastic, but Elastic has some bug where you can't upload GeoJSON
 through the web form, so I had to do it manually, like this:
+
 ```bash
-NAME,ECONOMY,FORMAL_EN,GDP_MD,ISO_A2
-ogr2ogr -f ElasticSearch -progress \
--select $fields \
--lco NOT_ANALYZED_FIELDS=$fields \
--lco INDEX_NAME=countries \
--lco OVERWRITE_INDEX=YES \
-ES:http://localhost:9200 \
-/vsizip/./ne_10m_admin_0_countries.zip/ne_10m_admin_0_countries.shp
+$ fields=NAME,ECONOMY,FORMAL_EN,GDP_MD,ISO_A2
+$ ogr2ogr -f ElasticSearch -progress \
+    -select $fields \
+    -lco NOT_ANALYZED_FIELDS=$fields \
+    -lco INDEX_NAME=countries \
+    -lco OVERWRITE_INDEX=YES \
+    ES:http://localhost:9200 \
+    /vsizip/./ne_10m_admin_0_countries.zip/ne_10m_admin_0_countries.shp
 ```
 
 But ogr2ogr yells at you after processing about 170 countries or so. If
@@ -39,15 +40,65 @@ Fortunately, QGIS has a Geometry Checker Plugin, but unfortunately, it's
 a bit complicated and was a pain to do. If you don't tune it right, you
 end up having to sort through lots of "mistakes" which aren't mistakes.
 
-Also there's an [unfixed bug](https://github.com/qgis/QGIS/issues/37527)
-in QGIS which doesn't make shape files correctly.
-Don't know how that's possible considering that's
-literally what the software's made for, but I could only get geojson
-input to work correctly.
-
 For anyone else who might be down this rabbit hole, Egypt is Object ID
 161--I promise that will save you time. Or you could just download my
 copy of the file here.
 
 Hoping to use this git repo as part of a bug report, once I read their
 process on that.
+
+Included here is [ESRI shape
+file](https://www.loc.gov/preservation/digital/formats/fdd/fdd000280.shtml)
+in the `ne_10m_admin_0_countries` directory as well as the same output
+in GeoJSON, since I think the format is a bit easier to work with.
+
+
+## Steps to reproduce the bug
+
+1. Download the original file from Natural Earth
+
+```bash
+wget https://www.naturalearthdata.com/http//www.naturalearthdata.com/download/10m/cultural/ne_10m_admin_0_countries.zip
+```
+
+2. Try to import the file into Elastic with the series of bash
+   commands given earlier. Or alternatively, just:
+
+```bash
+$ ogr2ogr -f ElasticSearch -progress \
+    -lco NOT_ANALYZED_FIELDS={ALL} \
+    -lco INDEX_NAME=countries \
+    -lco OVERWRITE_INDEX=YES \
+    ES:http://localhost:9200 \
+    /vsizip/./ne_10m_admin_0_countries.zip/ne_10m_admin_0_countries.shp
+```
+
+3. Observe you receive a similar error as given in `error.json`
+
+As a sanity check, you can re-run the same command without the fancy zip
+syntax by manually unzipping:
+
+```bash
+$ mkdir -p ne && unzip ne_10m_admin_0_countries.zip -d ne/
+$ ogr2ogr -f ElasticSearch -progress \
+    -lco NOT_ANALYZED_FIELDS={ALL} \
+    -lco INDEX_NAME=countries \
+    -lco OVERWRITE_INDEX=YES \
+    ES:http://localhost:9200 \
+    ne/ne_10m_admin_0_countries.shp
+```
+
+You will get the same error
+
+### Notes
+
+Oddly enough, converting to other formats *will not* yield the same
+error. I suspect there is some check that's not done by the GeoJSON
+(and other) drivers that the Elastic one does.
+
+
+``` bash
+$ ogr2ogr -progress -f GeoJSON test.geojson /vsizip/./ne_10m_admin_0_countries.zip/ne_10m_admin_0_countries.shp
+```
+
+^That runs just fine
-- 
cgit v1.2.3