pyGplates API to rotate shapefiles

Hi all,

I have gone through the documentation but couldn’t find an API which can be used to rotate shapefiles of type Point, Polyline, Polygon with “from age” and “to age”.

I tried reading the geometry from shapefiles and used PolylineOnSphere and PolygonOnSphere method to create feature collection to pass to reconstruct method of pyGplates but that didn’t work and the output shapefile was same as that of input shapefile.

Instead of doing this, is there any API available which can be used to rotate shapefile directly by passing “from age” and “to age” for all type(Point, Polyline, Polygon) of shapefiles

Thanks!

Hi vishesh,

You can actually get pyGPlates to read/write Shapefiles (as opposed to pulling out the geometries yourself). To reconstruct a Shapefile from present day (0 Ma) to a past time you can use code like this example:

Exported reconstructed features to a file

…which can be simplified to just:

pygplates.reconstruct(import_shapefilename, rotation_files, export_shapefilename, reconstruction_time)

…which reads import_shapefilename, reconstructs it to reconstruction_time using one or more rotation_files and writes the results to export_shapefilename.

That’s because the geometries in the Shapefile have yet to be assigned plate IDs which, along with the rotation file, tell GPlates/pyGPlates how to rotate the geometries.

You’ll need some polygons with plate IDs to do that. You can assign plate IDs in GPlates (see Tutorial 1.5: Creating Features - example 3).

Alternatively you can assign plate IDs using pyGPlates (see Import geometries and assign plate IDs). Those examples assume you’re importing geometries from a text file, so you can ignore that part and just do:

assigned_features = pygplates.partition_into_plates(
    static_polygons_filename,
    rotation_files,
    import_shapefilename,
    properties_to_copy = [
        pygplates.PartitionProperty.reconstruction_plate_id,
        pygplates.PartitionProperty.valid_time_period],
    reconstruction_time = 0)

…and then save the assigned features back to your Shapefile (overwriting it):

pygplates.FeatureCollection(assigned_features).write(import_shapefilename)

Now that you have a Shapefile with assigned plate IDs, you can use the pygplates.reconstruct function as described above.

Normally feature collections store their geometries in present day coordinates. All the above code assumes this. So if your “from age” is not present day (and hence the geometries in your Shapefile are not at present day) then it’s best to reverse reconstruct your Shapefile to present day by doing something like this:

pygplates.reverse_reconstruct(import_shapefilename, rotation_files, from_age)

…which will read import_shapefilename, reverse reconstruct from from_age to 0 Ma and write the result back to import_shapefilename.

Note that you should do this after you assign plate IDs because the plate IDs also tell pyGPlates how to reverse reconstruct. So your final code could look like:

assigned_features = pygplates.partition_into_plates(
    static_polygons_filename,
    rotation_files,
    import_shapefilename,
    properties_to_copy = [
        pygplates.PartitionProperty.reconstruction_plate_id,
        pygplates.PartitionProperty.valid_time_period],
    reconstruction_time = from_age)

pygplates.reverse_reconstruct(assigned_features, rotation_files, from_age)

pygplates.FeatureCollection(assigned_features).write(import_shapefilename)

…note that reconstruction_time = from_age is specified in the call to pygplates.partition_into_plates, which reconstructs the static polygons to from_age and then partitions your geometries using that configuration (noting that your geometries are already at from_age). Then your geometries are reverse reconstructed back to present day (with the pygplates.reverse_reconstruct call).

Then you can reconstruct to any to_age and store the result in export_shapefilename.

pygplates.reconstruct(import_shapefilename, rotation_files, export_shapefilename, to_age)

Regards,
John

Thank you very much for the detailed answer , John !!

While executing the script I’m getting ‘memory error’ in partition_into_plates function call when the shapefile size is high. It’s because the process is trying to take more than 2 GB RAM for 32 bit python which isn’t allowed in Windows. Is there any way we can split the shapefiles and pass it to this function or if partition_into_plates can take care of it in some way?

Thanks,
Vishesh

You’re right about the 2GB limit (instead of 4GB). We use the LARGEADDRESSAWARE linker flag in pygplates but the 32-bit Python installation does not have it and hence limits to 2GB.

The best bet is to use a 64-bit Python installation. We now compile pygplates for 64-bit on Windows. This will be available in the next public release (for Python 2 and 3). I’ve sent you a PM with a version for testing.

Regards,
John

Hi John,

I tried rotating polygon shapefiles with pygplates and it’s output has two files, polylines and polygons. Is there any reason why pygplates gives two files as output in case on polygon shapefiles but only one file in case of point and polyline shapefiles?

I tried to process a points shapefile with size of 27.3 MB .shp and 1.21 GB .dbf with 64 bit python and pyglplates 3.7 but it failed with memory error. Server has 8 GB RAM and the python process took up to 7 GB RAM. Is there a way to process huge shapefiles?

Also,we are planning to install ArcGIS 2.5 with python 3.6.9 on the same server with python 3.7 with pygplates. Will there be any conflict issues you are aware of?

Thanks!

1 Like

When you assign plate IDs to polygons they can get partitioned into polygons and polylines. This is a limitation with the current GPlates (that pyGPlates inherits) - see the warning here. This will be dealt with in a future release though (so that only polygons are output).

And because Shapefiles can only store one type of geometry, you get two output files (polygons and polylines).

That 1.2GB of attributes gets expanded a bit by GPlates (in part due to its GML lineage). So that’s probably a big part of your 7GB memory usage. You could try to strip out some of those attributes (using ArcGIS or QGIS).

Usually only minor versions of Python (ie, the ‘x’ in ‘3.x.y’) are guaranteed to be ABI (binary) compatible unless a small subset of the Python API is used - see here - but this is not normally the case. So Python 3.6 and 3.7 are not generally ABI compatible. By the way, this is for C/C++ extension modules - pure python modules don’t have this issue. But pyGPlates is a C++ extension module so it has this ABI compatibility issue. Our limited experience with pygplates on Python 3.6 and 3.7, it seems to be compatible (ie, you can import a pygplates built against 3.6 into either 3.6 or 3.7 without error). However to be safe we’ll probably provide a separate package for each minor version (starting with 3.5).

The plan is for pyGPlates 1.0 to have proper Python package support and therefore be installable using PIP (via pre-built platform-specific wheels). Then pyGPlates will get installed into the actual Python (eg, you could install into ArcGIS’s Python 3.6.9, or into Python 3.7, or separately into both), and no longer need to set PYTHONPATH for pyGPlates as is currently the case. So running pip install pygplates on Python 3.6 would cause PIP to select the wheel pygplates-1.0.0-cp36-cp36m-win_amd64.whl, and running pip install pygplates on Python 3.7 would use pygplates-1.0.0-cp37-cp37m-win_amd64.whl. And that’s just on Windows. Macos would be similar (eg, pygplates-1.0.0-cp35-cp35m-macosx_10_9_x86_64.whl, etc).

I guess that’s a long-winded way of saying, once pygplates 1.0 is released, just run pip install pygplates. Or even better python -m pip install pygplates, where python is either your ArcGIS Python 3.6 or your Python 3.7.

Regards,
John

2 Likes

Vishesh,

in addition to John’s suggestion to remove attribute data another way to try to ease the load when dealing with big feature data is to rationalise which geometry type to chose.

For example if you have a large point data set each point gets its own xml represenation - so it blows up a fair bit as John explained. If you can turn your feature collection into a line or multipoint geometry you can combine many individual point features into a single geometry instead which should be much more efficient.

Cheers,
Christian

ps - John great to hear about the pyGplates v1 plans re pip etc!