Pleiades: News and Views

http://planet.atlantides.org/pleiades

Tom Elliott (tom.elliott@nyu.edu)

This feed aggregator is part of the Planet Atlantides constellation. Its current content is available in multiple webfeed formats, including Atom, RSS/RDF and RSS 1.0. The subscription list is also available in OPML and as a FOAF Roll. All content is assumed to be the intellectual property of the originators unless they indicate otherwise.

September 04, 2008

Horothesia (Tom Elliott)

BAtlas IDs: first full release (all maps)

Grab the whole thing here: http://atlantides.org/batlas/2008-09-04/baids-2008-09-04.tgz

Let me know what problems you find.

README file for Barrington Atlas Identifiers, version published 2008-09-04
This is the first complete release.
Reference URL: http://atlantides.org/batlas

Background: http://horothesia.blogspot.com/search/label/batlasids
New maps covered in this release: 100, 101, 102
List of all maps presently covered: 1-102 (complete)

Major classes of change from prior versions are listed below. Consult individual files named like map22-diff.txt for output files differencing from prior version to this version.
  • No changes to previously released IDs.

September 03, 2008

Horothesia (Tom Elliott)

BAtlas ID update: Maps 1-6 and 64

README file for Barrington Atlas Identifiers, version published 2008-09-03
Reference URL: http://atlantides.org/batlas

Background: http://horothesia.blogspot.com/search/label/batlasids
New maps covered in this release: 1, 1a, 2, 3, 4, 5, 6, 64
List of all maps presently covered: 1-99

Major classes of change from prior versions are listed below. Consult individual files named like map22-diff.txt for output files differencing from prior version to this version.

  • No changes to previously released IDs.
  • Note that map 64 was erroneously listed as included in previous releases, but was not present. This deficiency is corrected with this release.

August 28, 2008

Horothesia (Tom Elliott)

Barrington Atlas ID update: maps 89-99

README file for Barrington Atlas Identifiers, version published 2008-08-28
Reference URL: http://atlantides.org/batlas

Background: http://horothesia.blogspot.com/search/label/batlasids
New maps covered in this release: 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99
List of all maps presently covered: 7-99

Major classes of change from prior versions are listed below. Consult individual files named like map22-diff.txt for output files differencing from prior version to this version.
  • No changes to previously released IDs.

August 20, 2008

Horothesia (Tom Elliott)

BAtlas ID Update: Maps 28-34, 67-71, 81-83

README file for Barrington Atlas Identifiers, version published 2008-08-20
Reference URL: http://atlantides.org/batlas

Background: http://horothesia.blogspot.com/search/label/batlasids
New maps covered in this release: 28, 29, 30, 31, 32, 33, 34, 67, 68, 69, 70, 71, 81, 82, 83
List of all maps presently covered: 7-88

Major classes of change from prior versions are listed below. Consult individual files named like map22-diff.txt for output files differencing from prior version to this version.

* No changes to previously released IDs.

August 19, 2008

Horothesia (Tom Elliott)

BAtlas ID update: Maps 7-9, 26-27

README file for Barrington Atlas Identifiers, version published 2008-08-19
Reference URL: http://atlantides.org/batlas

Background: http://horothesia.blogspot.com/search/label/batlasids
New maps covered in this release: 7, 8, 9, 26, 27
List of all maps presently covered: 7-27, 35-66, 72-80, 84-88

Major classes of change from prior versions are listed below. Consult individual files named like map22-diff.txt for output files differencing from prior version to this version.
  • No changes to previously released IDs.

August 18, 2008

Horothesia (Tom Elliott)

BAtlas ID update: Maps 19, 41-48

README file for Barrington Atlas Identifiers, version published 2008-08-15
Reference URL: http://atlantides.org/batlas

Background: http://horothesia.blogspot.com/search/label/batlasids
New maps covered in this release: 19, 41, 42, 43, 44, 45, 46, 47, 48
List of all maps presently covered: 10-25, 35-66, 72-80, 84-88

Major classes of change from prior versions are listed below. Consult individual files named like map22-diff.txt for output files differencing from prior version to this version.
  • No changes to previously released IDs.

August 08, 2008

Horothesia (Tom Elliott)

BAtlas ID update: Maps 14-18, 24, 25, 39, 40

README file for Barrington Atlas Identifiers, version published 2008-08-08
Reference URL: http://atlantides.org/batlas

Background: http://horothesia.blogspot.com/search/label/batlasids
New maps covered in this release: 14, 15, 16, 17, 18, 24, 25, 39, 40
List of all maps presently covered: 10-18, 20-25, 35-40, 49-65, 72-80, 84-88

Major classes of change from prior versions are listed below. Consult individual files named like map22-diff.txt for output files differencing from prior version to this version.

* No changes to previously released IDs.

August 05, 2008

Horothesia (Tom Elliott)

BAtlas IDs: Maps 10-13, 20-21, 49

README file for Barrington Atlas Identifiers, version published 2008-08-05
Reference URL: http://atlantides.org/batlas

Background: http://horothesia.blogspot.com/search/label/batlasids
New maps covered in this release: 10, 11, 12, 13, 20, 21, 49
List of all maps presently covered: 10, 11, 12, 13, 20, 21, 22, 23, 35, 36, 37, 38, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 65, 72, 73, 74, 75, 76, 77, 78, 79, 80, 84, 85, 86, 87, 87 inset, 88

Major classes of change from prior versions are listed below. Consult individual files named like map22-diff.txt for output files differencing from prior version to this version.

* No changes to previously released IDs.

August 04, 2008

Horothesia (Tom Elliott)

BAtlas ID update: maps 23, 84, 85, 87, 87 inset, 88 and fixed dates

README file for Barrington Atlas Identifiers, version published 2008-08-04
Reference URL: http://atlantides.org/batlas

Background: http://horothesia.blogspot.com/search/label/batlasids
New maps covered in this release: 23, 84, 85, 87, 87 inset, 88
List of all maps presently covered: 22, 23, 35, 36, 37, 38, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 65, 72, 73, 74, 75, 76, 77, 78, 79, 80, 84, 85, 86, 87, 87 inset, 88

Major classes of change from prior versions are listed below. Consult individual files named like map22-diff.txt for output files differencing from prior version to this version.
  • All readme files, dated folders and compressed tar files have been modified and renamed as necessary to redress the erroneous substitution of 2007 for 2008. No changes to IDs have occurred.

August 01, 2008

Horothesia (Tom Elliott)

Hidden Web: Don't Love It, Leave It

There's been a bit of buzz lately about Google's "failure" to effectively search the "hidden (deep) web". In the discussions I've been seeing, the hidden web is equated with stuff in academic and digital library repositories, i.e., "OAI-based resources" (which I assume to mean OAI/PMH).

I have to say: repositories != hidden web. The hidden web is simply the stuff the search engines don't find. Systems that surface information about their content only through OAI/PMH interfaces might make up a small part of the hidden web because they're not being surfaced to the bots, but frankly the hidden web holds way more stuff than what's in Fedora and DSpace at universities. Just ask Wikipedia.

The assertion that repository content == the hidden web is circular and false rhetoric that obscures the real problem: people are fighting the web instead of working with it. If you fight it, it will ignore you. This sort of thinking also makes hay for enterprises like the Internet Search Environment Number that seem to me to be trying to carve out business models that exploit, perpetuate and promote the cloistering of content and the rationing of information discovery.

Yesterday, Peter Millington posted what's effectively the antidote on the JISC-REPOSITORIES list (cross-posted to other lists). I reproduce it here in full because it's good advice not just for repositories but for anybody who is putting complex collections of content on the web and wants that content to be discoverable and useful:

Ways to snatch defeat from the jaws of victory
Peter Millington
SHERPA Technical Development Officer
University of Nottingham

You may have set up your repository and filled it with interesting papers, but it is still possible to screw things up technically so that search engines and harvesters cannot index your material. Here are seven common gotchas spotted by SHERPA:
  1. Require all visitors to have a username and password
  2. Do not have a 'Browse' interface with hyperlinks between pages
  3. Set a 'robots.txt' file and/or use 'robots' meta tags in HTML headers that prevent search engine crawling
  4. Restrict access to embargoed and/or other (selected) full texts
  5. Accept poor quality or restrictive PDF files
  6. Hide your OAI Base URL
  7. Have awkward URLs
Full explanations and some solutions are given at: http://www.sherpa.ac.uk/documents/ways-to-screw-up.html

If you know of any other ways in which things may go awry, please contact us and we will consider adding them to the list.

I'm happy to say: Pleiades gets a clean bill of health if we count nos. 5 and 6 as non-applicable (since we're not a repository per se and we don't have a compelling use case for OAI/PMH or PDF).
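Gotcha no. 3 in the list above is easy to check mechanically. Here is a minimal sketch using Python's standard urllib.robotparser; the two robots.txt bodies, the example URL and the function name crawlable are hypothetical and only illustrate the check, they are not taken from SHERPA or from Pleiades.

```python
from urllib.robotparser import RobotFileParser


def crawlable(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt text lets `agent` fetch `url`."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Hypothetical robots.txt bodies: one that shuts out every crawler, one that does not.
BLOCK_ALL = "User-agent: *\nDisallow: /\n"
ALLOW_ALL = "User-agent: *\nDisallow:\n"

if __name__ == "__main__":
    page = "http://example.org/places/1"  # hypothetical URL
    print(crawlable(BLOCK_ALL, page))
    print(crawlable(ALLOW_ALL, page))
```

The same function can be pointed at the text of a live robots.txt file once it has been downloaded by whatever means suits the site.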

Disclaimer: we are exploring the use of OAI/ORE through our Concordia project. One of the things we like most about it is that its primary serialization format is Atom, which is already indexed by the big search engines. With the web.

July 26, 2008

Horothesia (Tom Elliott)

BAtlas IDs: 4 more sets in Asia Minor, plus Cyprus

README file for Barrington Atlas Identifiers, version published 2008-07-26
Reference URL: http://atlantides.org/batlas

Background: http://horothesia.blogspot.com/search/label/batlasids
New maps covered in this release: 62, 63, 66, 72, 86
List of all maps presently covered: 22, 35, 36, 37, 38, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 65, 72, 73, 74, 75, 76, 77, 78, 79, 80, 86

Major classes of change from prior versions are listed below. Consult individual files named like map22-diff.txt for output files differencing from prior version to this version.

* No changes to previously issued files in this release

July 22, 2008

Horothesia (Tom Elliott)

BAtlas IDs: 10 more maps

README file for Barrington Atlas Identifiers, version published 2008-07-22
Reference URL: http://atlantides.org/batlas

Background: http://horothesia.blogspot.com/search/label/batlasids
New maps covered in this release: 50, 51, 52, 53, 54, 56, 57, 58, 59, 60
List of all maps presently covered: 22, 35, 36, 37, 38, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 65, 73, 74, 75, 76, 77, 78, 79, 80

Major classes of change from prior versions are listed below. Consult individual files named like map22-diff.txt for output files differencing from prior version to this version.

* Still suppressing ID creation for roads; have also added suppression for "coastal change" (so far only seen in Map 53)
* No changes to previously issued files in this release

Kostis Kourelis on GIS and Greek history

Kostis Kourelis has recently posted the third entry (Mapping Eleia) in an interesting series on toponyms, archaeology and GIS. There's no distinctive tag, so I'll explicitly link to the two previous posts as well:
He makes some useful observations about the challenges of working with historical GIS, including sourcing spatial data, tracking name changes and appropriations, addressing the shifting locations of particular conceptual places, navigating the influence of politics (both past and present) and handling issues of languages, scripts and transliteration.

Apropos language, script and transliteration there are many things to consider. It's in this domain of course that the simple, flat-file approach to GIS often breaks down, particularly given the problems of change over time and of fragmentary witnesses. For Pleiades -- which was designed with an object-oriented data model -- we record individual name variants, each of which has a language-and-script combination, an "original script" representation (using Unicode) and a transliteration. We can have as many names assigned to a given "place" as we have variants (or theories).
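To make the shape of that model concrete, here is a minimal sketch in Python. It is not the Pleiades schema: the class names, field names and the "grc-Grek" language-and-script tag are assumptions chosen for illustration, and the sample values are illustrative only.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class NameVariant:
    """One attested (or theorized) name for a place."""
    language_script: str   # hypothetical tag, e.g. "grc-Grek" for ancient Greek in Greek script
    original: str          # the name in its original script, stored as Unicode
    transliteration: str   # a Latin-alphabet rendering


@dataclass
class Place:
    """A conceptual place that can carry any number of name variants."""
    names: List[NameVariant] = field(default_factory=list)

    def transliterations(self) -> List[str]:
        return [n.transliteration for n in self.names]


if __name__ == "__main__":
    place = Place()
    place.names.append(NameVariant("grc-Grek", "Ἀφροδισιάς", "Aphrodisias"))
    place.names.append(NameVariant("grc-Grek", "Νινόη", "Ninoe"))
    print(place.transliterations())
```

Because each variant is its own object, adding another name (or another theory about a name) never requires widening a flat table.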

We have our own slightly idiosyncratic transliteration scheme for classical Greek (inherited from the Barrington Atlas; one of the major benefits of Pleiades is the ability to add back the original orthography). We could easily add multiple transliteration schemes (and the corresponding strings generated programmatically from the Unicode Greek). We may well need such a development when we move, as we eventually must, to include historical toponymy in Arabic (where both past and present variation in transliteration schemes renders even recent bibliography a veritable maze).
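Generating a transliteration from the Unicode Greek can be sketched in a few lines. The mapping table below is a hypothetical, simplified scheme made up for this example; it is not the Barrington Atlas scheme, and it ignores rough breathings, diphthongs and other refinements a real scheme would need.

```python
import unicodedata

# Hypothetical letter-by-letter scheme (NOT the Barrington Atlas scheme).
SCHEME = {
    "α": "a", "β": "b", "γ": "g", "δ": "d", "ε": "e", "ζ": "z",
    "η": "e", "θ": "th", "ι": "i", "κ": "k", "λ": "l", "μ": "m",
    "ν": "n", "ξ": "x", "ο": "o", "π": "p", "ρ": "r", "σ": "s",
    "ς": "s", "τ": "t", "υ": "y", "φ": "ph", "χ": "ch", "ψ": "ps",
    "ω": "o",
}


def transliterate(greek: str, scheme: dict = SCHEME) -> str:
    """Strip accents and breathings, then map each Greek letter to Latin."""
    # NFD splits each accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", greek)
    bare = "".join(c for c in decomposed if not unicodedata.combining(c))
    # Characters missing from the scheme pass through unchanged.
    return "".join(scheme.get(c, c) for c in bare.lower())


if __name__ == "__main__":
    print(transliterate("Ἀφροδισιάς"))
```

Supporting a second scheme would then mean passing a different table, with the stored Unicode original left untouched.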

If you're trying to do this in ArcGIS, you'll probably have to set up a relational database or manage a series of joined tables manually.

July 21, 2008

Horothesia (Tom Elliott)

BAtlas IDs update: IDs for 9 more maps

README file for Barrington Atlas Identifiers, version published 2008-07-21
Reference URL: http://atlantides.org/batlas

Background: http://horothesia.blogspot.com/search/label/batlasids
New maps covered: 35, 36, 74, 75, 76, 77, 78, 79, 80

Major classes of change from prior versions are listed below. Consult individual files named like map22-diff.txt for output files differencing from prior version to this version.

* No changes from prior versions beyond the addition of data for new maps.

Sean Gillies Blog

Linking open geographic data

Before I read Bjørn Sandvik's post about semantic geospatial services, I didn't know there was an effort to build bridges from WS-* to the Semantic Web. More significantly, his post is a useful introduction to semantic web issues for GIS users who are more familiar with the concept of services than the concept of Web resources. The metadata in OGC service capability documents (abstract, keywords, etc) are inadequate for the purpose of machine reasoning, and application of RDF and OWL could certainly make it more clear what a service is about.

There is a different effort already underway to semantically link open datasets called "Linked Data" [wikipedia, http://linkeddata.org] or "Linking Open Data" [W3C]. This service-free, or at least, service-agnostic, approach is the one I find more compelling for Pleiades and Concordia. The diagram below (from Richard Cyganiak) shows how geographic data courtesy of GeoNames is right at the heart of the Linked Data project.

http://richard.cyganiak.de/2007/10/lod/lod-datasets_2008-03-31.png

This graph of data has grown by leaps and bounds since we pitched Concordia. We didn't initially propose to join the Linked Data project, but I'd really like to see Ancient World datasets link themselves together in this way.

July 18, 2008

Horothesia (Tom Elliott)

BAtlas ID update: add map 55, more fixes

http://atlantides.org/batlas

Readme:

  • Eliminated duplication/collision problems with alias id values. Although the combination of label + map number + grid square is almost perfectly unique across the entire atlas, the same cannot be said for features not appearing on the maps (like "unlocated" and "false" toponyms), nor for alternative aliases, such as those created for individual constituent names in a multi-name label. Where possible, these collisions are eliminated by adding a one-up numbering scheme (postfixed) on the id, or by omitting non-primary alias alternatives where necessary. The one-up postfix numbers are also reflected in matching captions (in parentheses).
  • Corrected initial two-capitals error in geographic names and associated captions generated when a parenthetical variant indication leads the toponym (e.g., (L)Ibida, which should produce the variants "Libida" and "Ibida", not "LIbida" and "Ibida").
  • Captions for "group" features now read like "aqueduct group" instead of "aqueduct-group".
  • Suppress serialization of a few redundant captions
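Two of the fixes above lend themselves to a short illustration. The following is a sketch only, not the code behind the release: the function names are hypothetical, and the "-1", "-2" form of the one-up postfix is an assumption, since the readme does not spell out the exact format.

```python
import re
from collections import Counter
from typing import List


def add_one_up_postfix(ids: List[str]) -> List[str]:
    """Append -1, -2, ... to ids that collide; leave unique ids alone."""
    totals = Counter(ids)
    seen: Counter = Counter()
    result = []
    for ident in ids:
        if totals[ident] > 1:
            seen[ident] += 1
            result.append(f"{ident}-{seen[ident]}")  # assumed postfix format
        else:
            result.append(ident)
    return result


def expand_leading_variant(label: str) -> List[str]:
    """Expand a label like "(L)Ibida" into its two readings."""
    match = re.match(r"^\(([^)]+)\)(.+)$", label)
    if match is None:
        return [label]
    optional, rest = match.groups()
    # Lowercase the letter that follows the optional prefix so the result
    # is "Libida", not "LIbida".
    return [optional + rest[0].lower() + rest[1:], rest]


if __name__ == "__main__":
    print(expand_leading_variant("(L)Ibida"))
    print(add_one_up_postfix(["x-22-unlocated", "y-22-unlocated", "x-22-unlocated"]))
```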

July 15, 2008

Horothesia (Tom Elliott)

BAtlas ID update: add map 37, various fixes

Latest XML in http://atlantides.org/batlas/

I've added a read-me file, as well as text files containing diffs between previous and current versions of the individual xml files. The readme file says:
  • altered citations so that location descriptions for unnamed features are enclosed in parentheses
  • fixed bug in processing of fragmentary, unreconstructable geognames so that lacunae are signaled with parentheses around ellipsis (...) instead of ellipsis alone; also, mark them correctly as completeness="non-reconstructable" instead of type="variant"
  • remove inverted quotes from geogname variants and instead mark them as accuracy="inaccurate"
  • handle group notation in location description for unnamed features like aqueducts and villas so that, e.g., Map-by-Map directory entry in Aqueducts for map 22 C5 with location description "Nicopolis ad Istrum (2)" becomes "aqueduct-group-nicopolis-ad-istrum-22-c5" instead of "aqueduct-nicopolis-ad-istrum-2-22-c5"; this also adds a new child element indicating the number of features associated with the group.
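The group-notation rule in the last item can be sketched as follows. This is an illustration under stated assumptions rather than the release code: the function names and the returned (alias, count) pair are hypothetical, and the slug helper only handles plain ASCII labels.

```python
import re
from typing import Tuple


def slug(text: str) -> str:
    """Lowercase a plain-ASCII label and join its words with hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def unnamed_feature_alias(kind: str, location: str, map_no: str, grid: str) -> Tuple[str, int]:
    """Build an alias for an unnamed feature; a trailing "(n)" marks a group of n."""
    match = re.match(r"^(.*?)\s*\((\d+)\)$", location)
    if match:
        place, count = match.group(1), int(match.group(2))
        stem = f"{kind}-group-{slug(place)}"
    else:
        place, count = location, 1
        stem = f"{kind}-{slug(place)}"
    return f"{stem}-{map_no}-{grid.lower()}", count


if __name__ == "__main__":
    print(unnamed_feature_alias("aqueduct", "Nicopolis ad Istrum (2)", "22", "C5"))
```

The count is returned alongside the alias so that a caller could record how many features the group stands for.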

July 14, 2008

Horothesia (Tom Elliott)

Barrington Atlas IDs

Update: follow the batlasids tag trail for follow-ups.

Back in February, I blogged about clean URLs and feed aggregation. In March, we learned about the ORE specification for mapping resource aggregations in Atom XML, just as we were gearing up to start work on the Concordia project, with support from the US National Endowment for the Humanities and the UK Joint Information Systems Committee.

Our first workshop was held in May. One of the major outcomes was a to-do for me: provide a set of stable identifiers for every citable geographic feature in the Barrington Atlas so collaborators could start publishing resource maps and building interoperation services right away, without waiting for the full build-out of Pleiades content (which will take some time).

The first fruits can be downloaded at: http://atlantides.org/batlas/ . All content under that URL is licensed cc-by. Back versions are in dated subdirectories.

There you'll find XML files for 3 of the Atlas maps (22, 38 and 65). There's only one feature class for which we don't provide IDs: roads. More on why not another time. I'll be adding files for more of the maps as quickly as I can, beginning with Egypt and the north African coast west from the Nile delta to Tripolitania (the Concordia "study area"). Our aim is full coverage for the Atlas within the next few months.

What do you get in the files?


IDs (aka aliases) for every citable geographic feature in the Barrington Atlas. For example:
  • BAtlas 65 G2 Ouasada = ouasada-65-g2
If you combine one of these aliases with the "uribase" also listed in the file (http://atlantides.org/batlas/) you get a Uniform Resource Identifier for that feature (this should answer Sebastian Heath's question).
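As a minimal sketch of that combination (the function name batlas_uri is hypothetical; the uribase and the alias come from the examples in this post):

```python
URIBASE = "http://atlantides.org/batlas/"  # the "uribase" listed in the XML files


def batlas_uri(alias: str, uribase: str = URIBASE) -> str:
    """Join the uribase and an alias into a full URI."""
    # Normalize so there is exactly one slash between base and alias.
    return uribase.rstrip("/") + "/" + alias.lstrip("/")


if __name__ == "__main__":
    print(batlas_uri("ouasada-65-g2"))
    # prints http://atlantides.org/batlas/ouasada-65-g2
```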

For features with multiple names, we provide multiple aliases to facilitate ease of use for our collaborators. For example, for BAtlas 65 A2 Aphrodisias/Ninoe, any of the following aliases are valid:
  • aphrodisias-ninoe-65-a2
  • aphrodisias-65-a2
  • ninoe-65-a2
Features labeled in the Atlas with only a number are also handled. For example, BAtlas 38 C1 no. 9 is glossed in the Map-by-Map Directory with the location description (modern names): "Siret el-Giamel/Gasrin di Beida". So, we produce the following aliases, all valid:
  • (9)-38-c1
  • (9)-siret-el-giamel-gasrin-di-beida-38-c1
  • (9)-siret-el-giamel-38-c1
  • (9)-gasrin-di-beida-38-c1
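The pattern behind the two lists above (one combined alias, plus one alias per constituent name, with an optional leading number) can be sketched like this. It is a reconstruction from the examples, not the actual generator: the function names are hypothetical and the slug helper assumes plain ASCII input.

```python
import re
from typing import List, Optional


def slug(text: str) -> str:
    """Lowercase a plain-ASCII label and join its words with hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def aliases(label: str, map_no: str, grid: str, number: Optional[str] = None) -> List[str]:
    """Return every alias for a label whose names are separated by "/"."""
    names = [n for n in label.split("/") if n.strip()]
    stems = [slug(label)] if names else []
    if len(names) > 1:
        stems.extend(slug(n) for n in names)  # one alias per constituent name
    prefix = f"({number})-" if number else ""
    suffix = f"{map_no}-{slug(grid)}"
    result = [f"{prefix}{suffix}"] if number else []  # bare-number alias
    result.extend(f"{prefix}{stem}-{suffix}" for stem in stems)
    return result


if __name__ == "__main__":
    for alias in aliases("Siret el-Giamel/Gasrin di Beida", "38", "C1", number="9"):
        print(alias)
```

Called with "Aphrodisias/Ninoe", "65" and "A2", the same function yields the three aliases listed earlier.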
Most unlabeled historical/cultural features also get identifiers. For example:
  • Unnamed aqueduct at Laodicea ad Lycum in BAtlas 65 B2 = aqueduct-laodicea-ad-lycum-65-b2
  • Unnamed bridge at Valeriana in BAtlas 22 B5 = bridge-valeriana-22-b5
Unlocated toponyms and false names (appearing only in the Map-by-Map Directory) get treated like this:
  • BAtlas 22 unlocated Acrae = acrae-22-unlocated
  • BAtlas 38 unlocated Ampelos/Ampelontes? = ampelos-ampelontes-38-unlocated = ampelos-38-unlocated = ampelontes-38-unlocated
  • BAtlas 65 false name ‘Itoana’ = itoana-65-false
The XML files also provide associated lists of geographic names, formatted BAtlas citations and other information useful for searching, indexing and correlating these entries with your own existing datasets. What you don't get is coordinates. That's what the Pleiades legacy data conversion work is for, and it's a slower and more expensive process.
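If you want to correlate aliases with an existing dataset, it can help to split them back into their parts. The sketch below infers the pattern from the examples in this post; the regular expression, the names AliasParts and parse_alias, and the assumption that grid squares are one letter followed by digits are all mine, and aliases carrying a one-up postfix are not handled.

```python
import re
from typing import NamedTuple, Optional


class AliasParts(NamedTuple):
    name: str      # e.g. "ouasada" or "(9)"
    map_no: str    # e.g. "65"
    location: str  # a grid square such as "g2", or "unlocated" / "false"


# Assumed shape: <name>-<map>-<grid square | unlocated | false>
_ALIAS = re.compile(r"^(?P<name>.+)-(?P<map>\d+[a-z]?)-(?P<loc>[a-z]\d+|unlocated|false)$")


def parse_alias(alias: str) -> Optional[AliasParts]:
    """Split an alias into name, map number and location; None if it does not fit."""
    match = _ALIAS.match(alias)
    if match is None:
        return None
    return AliasParts(match.group("name"), match.group("map"), match.group("loc"))


if __name__ == "__main__":
    print(parse_alias("ampelos-ampelontes-38-unlocated"))
```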

Read on to find out how you can start using these identifiers now, and get links to the corresponding Pleiades data automatically as it comes on line over time.

Why do we need these identifiers?


Separate digital projects would like to be able to refer unambiguously to any ancient Greek or Roman geographic feature using a consistent, machine-actionable scheme. The Barrington Atlas is a stable, published resource that can provide this basis if we construct the corresponding IDs.

Even without coordinates, other projects can begin to interoperate with each other immediately, as long as they have a common scheme of identifiers. After using BAtlas URIs to normalize, control or annotate their geographic description, they can publish services or crosswalks that provide links for the relationships within and between their datasets. For example, for each record in a database of coins you might like links to all the other coins minted by the same city, or to digital versions (in other databases) of papyrus documents and inscriptions found at that site.
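A crosswalk of that kind needs nothing more than a shared key. Here is a minimal sketch; the record ids, the field name "place" and both sample datasets are hypothetical, and it assumes both projects have already settled on the same alias for each feature (alternative aliases for one feature would have to be normalized first).

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical records from two independent projects, keyed by BAtlas URI.
COINS = [
    {"id": "coin-1", "place": "http://atlantides.org/batlas/aphrodisias-ninoe-65-a2"},
    {"id": "coin-2", "place": "http://atlantides.org/batlas/aphrodisias-ninoe-65-a2"},
    {"id": "coin-3", "place": "http://atlantides.org/batlas/ouasada-65-g2"},
]
INSCRIPTIONS = [
    {"id": "inscr-7", "place": "http://atlantides.org/batlas/aphrodisias-ninoe-65-a2"},
]


def index_by_place(records: List[dict]) -> Dict[str, List[str]]:
    """Group record ids under the place URI they cite."""
    index: Dict[str, List[str]] = defaultdict(list)
    for record in records:
        index[record["place"]].append(record["id"])
    return index


def crosswalk(left: List[dict], right: List[dict]) -> Dict[str, Tuple[List[str], List[str]]]:
    """For every place cited in both datasets, pair up the record ids."""
    left_index, right_index = index_by_place(left), index_by_place(right)
    shared = sorted(left_index.keys() & right_index.keys())
    return {place: (left_index[place], right_index[place]) for place in shared}


if __name__ == "__main__":
    for place, (coins, texts) in crosswalk(COINS, INSCRIPTIONS).items():
        print(place, coins, texts)
```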

Moreover, we would like other projects to start using a consistent identifier scheme now, so that as Pleiades adds content we can build more interoperation around it (e.g., dynamic mapping, coordinate lookup, proximity search across multiple collections). To that end, Pleiades will provide redirects (303 see other) from Barrington Atlas URIs (following the scheme described here) as follows:
  • If a corresponding entry exists in Pleiades, the web browser will be redirected to that Pleiades page automatically
  • If there is not yet a corresponding entry in Pleiades, the web browser will be redirected to an HTML page providing a full human-readable citation of the Atlas, as well as information about this service
So, for example:
  • http://atlantides.org/batlas/aphrodisias-ninoe-65-a2 will re-direct to http://pleiades.stoa.org/places/638753
  • http://atlantides.org/batlas/vlahii-22-e4 will re-direct to http://atlantides.org/batlas/vlahii-22-e4.html until there is a corresponding Pleiades record
The HTML landing pages for non-Pleiades redirects are not in place yet, but we're working on it. We'll post again when that's working.
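The redirect rule itself is simple enough to sketch. This is an illustration of the logic described above, not the service's code: the lookup table stands in for whatever Pleiades will actually consult, and the function name resolve is hypothetical. The one table entry comes from the example above.

```python
from typing import Dict, Tuple

BATLAS_BASE = "http://atlantides.org/batlas/"

# Stand-in for the real lookup of aliases that already have Pleiades entries.
KNOWN_IN_PLEIADES: Dict[str, str] = {
    "aphrodisias-ninoe-65-a2": "http://pleiades.stoa.org/places/638753",
}


def resolve(alias: str, known: Dict[str, str] = KNOWN_IN_PLEIADES) -> Tuple[int, str]:
    """Return the HTTP status and Location target for a BAtlas alias."""
    target = known.get(alias)
    if target is None:
        # No Pleiades entry yet: fall back to the HTML citation page.
        target = f"{BATLAS_BASE}{alias}.html"
    return 303, target  # 303 See Other in both cases


if __name__ == "__main__":
    print(resolve("aphrodisias-ninoe-65-a2"))
    print(resolve("vlahii-22-e4"))
```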

Why URIs for a discretely citable feature in a real-world, printed atlas?

I'll let Bizer, Cyganiak and Heath explain the naming of resources with URI references. In the parlance of "Linked Data on the Web," Barrington Atlas features are "non-information resources"; that is, they are non-digital/real-world discrete entities about which web authors and services may want to make assertions or around which to perform operations. What we are doing is creating a stable system for identifying and citing these resources so that those assertions and operations can be automated using standards-compliant web mechanisms and applications. The HTML pages to which web browsers will be automatically redirected constitute "information resources" that describe the "non-information resources" identified by the original URIs.

How

If I get a comment box full of requests for a blow-by-blow description of the algorithm, I'll post something on that. If you're really curious and energetic, have a look at the code. It's intended mostly for short-term, internal use, so it's not marvelously documented. Yes, it's a hack.

One of the big headaches was deciding how to decompose the complex labels into simple, clean ASCII strings that can be legal URL components. Sean blogged about that, and wrote some code to do it, shortly after the workshop.
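For a flavor of the problem, here is one common way to reduce a label to a URL-safe ASCII string. It is a generic sketch, not Sean's code and not necessarily the rule set we settled on: the function name is hypothetical, and simply dropping whatever has no ASCII equivalent is an assumption that a real implementation might well reject.

```python
import re
import unicodedata


def ascii_slug(label: str) -> str:
    """Reduce a label to lowercase ASCII words joined by hyphens."""
    # NFKD separates base letters from their diacritics.
    decomposed = unicodedata.normalize("NFKD", label)
    # Dropping non-ASCII bytes removes the diacritics (and curly quotes).
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    # Any run of other characters (spaces, slashes, "?") becomes one hyphen.
    return re.sub(r"[^a-z0-9]+", "-", ascii_only.lower()).strip("-")


if __name__ == "__main__":
    print(ascii_slug("Ampelos/Ampelontes?"))
    print(ascii_slug("‘Itoana’"))
```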

Credit where credit is due

Sean and I had a lot of help from the workshop participants (Ben Armintor, Gabriel Bodard, Hugh Cayless, Sebastian Heath, Tim Libert, Sebastian Rahtz and Charlotte Roueché) in sorting out what to do here. Older, substantive conversations that informed this process (with these folks and others; notably Rob Chavez, Greg Crane, Ruth Mostern, Dan Pett, Ross Scaife†, Patrick Sims-Williams, Linda Smith and Neel Smith) go back as far as 2000, shortly after the Atlas was published.

Many thanks to all!

Examples in the Wild


Sebastian Rahtz has already mocked up an example service for the Lexicon of Greek Personal Names. It takes a BAtlas alias and returns you all the name records in their system that are associated with the corresponding place. So, for example:
  • http://clas-lgpn2.class.ox.ac.uk/batlas/aloros-50-b3
This is just one of several services that LGPN is developing. See the LGPN web services page, as well as the LGPN presentation to the Digital Classicist Seminar in London last month.

Sebastian Heath, for some time, has been incorporating Pleiades identifiers into the database records of the American Numismatic Society. He has blogged about that work in the context of Concordia.

Do you have an application? Let me know!

July 13, 2008

Horothesia (Tom Elliott)

BAtlas ID update: add map 73, revise captions for rivers, islands and island groups

I have just updated the XML files providing Barrington Atlas IDs and associated information (background). The following additions and changes were made:
  • IDs added for Map 73 (Ammon)
  • All files refreshed so that alternative captions for rivers, islands and island groups with multiple names all carry the appropriate formulaic postfix expression (fl., Ins., Inss.); no alias IDs have been changed; no geogname elements have been changed; no features have been added or removed
Copies of the prior versions are available for reference in an appropriately-dated subdirectory, e.g.: http://atlantides.org/batlas/2007-07-11/

June 12, 2008

Sean Gillies Blog

Penstemonium

This is P. strictus, perhaps the most beautiful perennial wildflower of the Mountain West, just beginning to bloom today. We've got several of these around the yard, grown from seed I collected near Granby, CO in 2006. The neighborhood bees are also big fans, and last year inspired our little toddler to yell, "the bees are going in the tunnels"! Indeed. I'm rather pleased at how this shot turned out.

http://zcologia.com/images/penstemon-strictus.jpg

In the background you can see a crimson eruption of P. eatonii, a native of the Colorado Plateau.

In other news, my wonderful gig at UNC is up. I've got a break before my next one starts (continuing Pleiades and getting more into digital humanities), and plan to spend some of it on vacation, some of it in the garden, some of it getting back into home brewing (beer for sure, electronics maybe a little), and some of it on cool Web projects.