entitySplitters broken on Morton


entitySplitters that were all verified as working last week are no longer working.

Normally in my mappings I simply set the target to "ca_entities", and as of last week this was getting the job done on Morton. Today I noticed that entities were failing to be generated, so I checked the log, which stated for each entity:

"2015-02-04 15:05:25 - ERROR --> Could not insert entity J./Agger: 'Agger, J.' is not valid for Identifier; only letters and numbers are allowed"

So it looks like it is trying to stick the source data in idno, which fails because idno is set to autogenerate. The refinery appears to no longer "split" the name up into forename/surname.

To troubleshoot, I changed the target in the mapping to "ca_entities.preferred_labels". The entities still failed to generate, but this time the log says:

"2015-02-04 15:19:49 - ERROR --> Could not insert entity J./Agger: Type must be specified"

The thing is, entityType is set correctly. I have confirmed this.

Could there have been recent changes that would cause the splitters to fail?





User known
February 4, 2015, 3:39 PM

Nothing has changed on Morton. Where did you verify them as working?

Jonathan Byerley
February 4, 2015, 3:48 PM

The splitters were all working between 1/21 and 1/29. I'm not sure which exact days I "greenlighted" each splitter, but I was testing in that range. I do know that Sophie ran a pull for me late last week, so maybe that made the change kick in?

User known
February 4, 2015, 4:02 PM

I didn't realize you did a pull. I didn't change 'em (I think). I'm looking now.

User known
February 4, 2015, 4:59 PM

There was a recent change to splitters to simplify matching. If idno is not otherwise set, the splitter now automatically copies the mapped value into the idno field, not just into preferred labels. This facilitates lookups on existing authority records (entities, etc.) by idno when merging different data sets. It made NY Society library easier to do, but maybe the behavior is a bit too obscure and magical. It really could just be a splitter option, or you could even just set idno manually.

In any event, if you (a) map to ca_entities (not ca_entities.preferred_labels as you had been doing; I assume you were just trying random things to get this to work?) and (b) explicitly set idno in the "attributes" block of the splitter to "%" (which is the placeholder for an auto generated value) it'll work.
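As a rough sketch of step (b), the splitter's options might look something like the JSON this snippet prints. The "attributes" block with idno set to "%" comes from the description above; the "entityType" value "ind" is only a hypothetical type code for illustration, so substitute whatever your installation actually uses. The Python wrapper is just a convenient way to produce well-formed JSON, not part of the importer.

```python
import json

# Options for an entitySplitter whose mapping target is "ca_entities".
# ASSUMPTION: "ind" is a hypothetical entity type code; use your own.
splitter_options = {
    "entityType": "ind",
    "attributes": {
        # "%" is the placeholder for an auto-generated idno, so the
        # splitter does not copy the source name into the idno field.
        "idno": "%",
    },
}

# Emit the JSON to paste into the refinery options for the mapping row.
print(json.dumps(splitter_options, indent=2))
```

With idno set explicitly like this, the source value (for example "Agger, J.") should no longer be pushed into idno, which is what triggered the "not valid for Identifier" error.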

Attached is a working mapping.

Jonathan Byerley
February 4, 2015, 5:23 PM

Oh, ok. Yeah, I only tried 'ca_entities.preferred_labels' as a troubleshooting measure when the log told me it was trying to stick the names in idno.

I'll revise all the other mappings with this change. Thanks!

