23 March 2013

My RootsTech 2013: Day 2 - Ancestry.com breaking news, FHISO and GEDCOMX, handwriting recognition

 As always, a highlight of the day was the keynote. Today it was sponsored by Ancestry.com, who took the opportunity to talk about family history collaboration in general, using online trees for collaboration, and--breaking news--their biggest partnership with FamilySearch yet:
  • Over the next 3 years, Ancestry.com will index approximately 140 million pages of probate records previously microfilmed by FamilySearch.
  • Ancestry.com also announced the new price for their DNA test: $99 for everyone, not just for subscribers as when they launched it.
 One of my acquaintances at FamilySearch, Ben Baker (seen on the left), invited me via e-mail to talk to him about his relationship calculator idea during the lunch hour. He was presenting his idea in the BYU Technology Workshop demo area outside room 260. My only regret was not knowing about the BYU Technology Workshop offerings sooner. Some of the "how to" complexities are lost on my user (non-developer) brain, but it was well worth my time. Most things were discussed in concept and user-friendly language.


The relationship calculator can be tweaked to include more or fewer relationship types, and it can be capped at a certain generation or left unlimited. The main concern Ben seemed to have at this time is that his model would take too long to process information inside the FamilySearch database. He displayed two different algorithms he has tried.
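To make the idea of a generation cap concrete, here is a minimal sketch of one way a relationship calculator could work. It is not either of Ben's algorithms, which were only shown in concept; the toy pedigree, the function names, and the relationship labels are all assumptions made up for illustration. Each person's ancestors are collected with a breadth-first search that stops at `max_generations`, and the nearest shared ancestor decides the label.

```python
from collections import deque

# Hypothetical toy pedigree (child -> parents); not FamilySearch data.
PARENTS = {
    "me": ["mom", "dad"],
    "sister": ["mom", "dad"],
    "mom": ["grandma"],
    "aunt": ["grandma"],
    "cousin": ["aunt"],
    "cousin_child": ["cousin"],
}


def ancestors(person, parents, max_generations=None):
    """Map the person (0) and each ancestor to the fewest generations up."""
    depth = {person: 0}
    queue = deque([person])
    while queue:
        current = queue.popleft()
        if max_generations is not None and depth[current] >= max_generations:
            continue  # cap reached: do not climb any further
        for parent in parents.get(current, []):
            if parent not in depth:
                depth[parent] = depth[current] + 1
                queue.append(parent)
    return depth


def _lineal(n, base):
    """1 -> base, 2 -> grand<base>, 3 -> great-grand<base>, and so on."""
    return base if n == 1 else "great-" * (n - 2) + "grand" + base


def describe(gen_a, gen_b):
    """Label what b is to a, given each one's distance to the shared ancestor."""
    if gen_a == 0 and gen_b == 0:
        return "same person"
    if gen_b == 0:
        return _lineal(gen_a, "parent")
    if gen_a == 0:
        return _lineal(gen_b, "child")
    if gen_a == 1 and gen_b == 1:
        return "sibling"
    if gen_b == 1:
        return _lineal(gen_a - 1, "aunt/uncle")
    if gen_a == 1:
        return _lineal(gen_b - 1, "niece/nephew")
    label = f"cousin (degree {min(gen_a, gen_b) - 1})"
    removed = abs(gen_a - gen_b)
    return label + (f", {removed}x removed" if removed else "")


def relationship(a, b, parents, max_generations=None):
    """Return a label for b relative to a, or None if no shared ancestor is found."""
    up_a = ancestors(a, parents, max_generations)
    up_b = ancestors(b, parents, max_generations)
    common = set(up_a) & set(up_b)
    if not common:
        return None
    # Nearest shared ancestor; ties are broken by name so results are stable.
    nearest = min(common, key=lambda p: (up_a[p] + up_b[p], p))
    return describe(up_a[nearest], up_b[nearest])


print(relationship("me", "cousin_child", PARENTS))
print(relationship("me", "cousin", PARENTS, max_generations=1))
```

With the cap set to one generation, the search never reaches the shared grandmother, so the cousin is reported as unrelated. That is the trade-off of a cap: less data to walk through, at the cost of missing more distant relationships.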

 Randy Wilson of FamilySearch demoed his ideas about working toward a model for indexed record data that could become a standard for data transfer (think GEDCOMX). He showed a concept similar to part of Ancestry.com's source-attaching process (the part where you choose which indexed data to merge into your tree). As an aside, my favorite class of the day was the 4:15pm class on GEDCOMX, in which Ryan Heaton of FamilySearch explained their vision clearly and to the point. Ryan later mentioned Randy's concept of a model for indexed data as a post-"GEDCOMX 1.0" priority.

In the FHISO (Family History Information Standards Organization) class at 3pm, many of us voiced the community's opinion that FHISO was working too slowly and that we wanted to see FamilySearch working with them. To the credit of both FHISO and Ryan of GEDCOMX (FamilySearch), they mentioned each other in a friendly manner. The points raised in the FHISO discussion did not seem to be answered to the satisfaction of many present. For example, it was asked what we have to work with today regarding a better GEDCOM model, and where we might read more specifics of a suggested model for any particular standard. While the answers there seemed soft and tentative, the GEDCOMX vision discussed in the next hour was well defined--an example of a modern and live (within FamilySearch) model that could be used by FHISO and altered to the community's needs. Ryan was enthusiastic about the idea of working with FHISO to adjust the "open" GEDCOMX. The complexities were interesting. It was mentioned that the name GEDCOMX is owned by FamilySearch and requires a citation crediting them, but the things needed to implement it are all open and may be altered without permission. When we asked Ryan why FamilySearch had not officially joined FHISO, he told us that he wanted the same thing and that we should take it to upper management, because his recommendation had previously been dismissed. We do not know why the choice was made, so it is our job to try to find contact information for the right person (that isn't easy) and tell them we want to see an official partnership.

 Back at the BYU Tech. Workshop, at 1:45pm I attended the session depicted to the left. There were about three shorter talks, all of them pretty easy for me to understand except the last, which was still interesting. The overall take-away message (I think) was that FamilySearch is listening to the students at BYU and others about their research. They want to be able to use some automation soon, but it is still rather complex. Various models for handwriting recognition were discussed. Larger research organizations were mentioned.

From June 2011 to March 2012, FamilySearch had an evaluation done to measure the accuracy of one method. BYU students have done evaluations too. The general result was handwriting recognition accuracy between 50% and 90%. The 90% figure was from a single-author study in which the model words were taken from the same handwriting style (author) as the writing being tested. Accuracy was closer to 70% when multiple authors were tested. The variation in error was significant depending on method, type of record, etc. One of the FamilySearch representatives commented that the accuracy is already high enough for limited data fields like gender, marital status, and other simple census fields that such a thing could be used soon to at least provide a suggestion to an indexer/arbitrator.

No specific future use plans were revealed, but the overall feel from the workshop and Dennis Brimhall's unconferencing comment (as I recall) is that FamilySearch will seriously look at using this in some form within the next year or two. No time frame was mentioned; that is my estimate based on the sentiment of those in a position to decide. I think it is key that the community of bloggers and tech-loving genealogists understand the realities so we do not expect too much too soon. Still, as a practical application example, it sounds realistic to have at least limited fields on form-based records available as a key-A in an indexing experience.
Any time saved without introducing more error is good. People index with 7-8% error on average, as FamilySearch reported in this workshop session. If the computer can pre-fill some fields at 90% accuracy with today's technology, then I say let us do it--but as something to be arbitrated, not as a final result.
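One reason limited fields are a plausible first step is that a noisy reading only has to be matched against a handful of allowed values. The sketch below illustrates that idea and nothing more: it is not a method presented in the workshop, the field names and value lists are hypothetical, and it uses simple string similarity from Python's standard `difflib` module as a stand-in for a real recognizer's output.

```python
from difflib import get_close_matches

# Hypothetical closed vocabularies for simple census-style fields.
FIELD_VALUES = {
    "gender": ["male", "female"],
    "marital_status": ["single", "married", "widowed", "divorced"],
}


def suggest(field, recognized_text, cutoff=0.6):
    """Return the closest allowed value for a field, or None if nothing is close.

    The result is only a suggestion for an indexer or arbitrator to confirm.
    """
    cleaned = recognized_text.strip().lower()
    matches = get_close_matches(cleaned, FIELD_VALUES[field], n=1, cutoff=cutoff)
    return matches[0] if matches else None


print(suggest("marital_status", "marred"))
print(suggest("gender", "femle"))
print(suggest("gender", "xyz"))
```

A misread such as "marred" still lands on "married" because there are only four candidates to choose from, while something unrecognizable returns nothing and is left for a person to key in.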

To the right, a BYU student shows his project on word recognition in headstones. He is in talks with BillionGraves, which, as I mentioned in my post yesterday, is looking at this kind of technology. Perhaps this is the guy who prompted them to look into it. At any rate, we know it is being worked on by someone, and that is exciting.

It turns out that none of the workshop ideas above are fully implemented or ready for prime time. That is simply the nature of a technology workshop put on by cutting-edge genealogy technologists.

My main takeaway is that people-who-know-more-than-I-do want the technologies I want just as much as I do, but they also understand the complexities better than I do. After hearing about some of those complexities, I will be much less likely to complain to my blogger and other friends. Perhaps this should be a required experience so we can all understand more of the hard work and determination behind the scenes. Even though it is complex, they are closer every year to solving these issues--an exciting idea, and what RootsTech is about for people like me.

Miscellaneous
  • RootsMagic released an update for RootsMagic 6, making it the first program to work directly with FamilySearch Family Tree instead of the dying new.FamilySearch.org
  • Between David Rencher's class yesterday (my favorite Thursday class) and the FGS RPAC (Records Preservation and Access Committee) unconferencing session today, I've enjoyed trying to stay in the loop with legislative issues. It is an important area to be involved in, and there is much opportunity to help.

8 comments:

  1. We do not know why the choice was made, so it is our job to try and find contact information for the right person (that isn't easy) and tell them we want to see an official partnership.

    Who is OUR in our job. Are you going to approach them? It seems rather strange that Ancestry and other members are IN, but Family Search is OUT.

  2. many of us voiced the community's opinion that FHISO was working too slowly.

    From what I understand, FHISO is a bunch of dedicated genealogists with little/no actual support from the community. So being slow would be obvious and if the community wants it moving faster, they need to get in and support it.

  3. Thank you for the discussion on FHISO and GEDCOMX.

    I finally figured out that FHISO doesn't plan to develop a GEDCOM replacement, but would endorse or embrace a GEDCOM replacement.

    What else was said about GEDCOMX? Is it being used in the FamilySearch API to enable interaction with the Family Tree?
    When will there be a working model?
    What does it presently include? How will they deal with sources?

    Any additional info would be very helpful and informative to me and others who didn't attend.

  4. Alex,

    I have approached FamilySearch employees about the issue, but not found the person in charge yet. I will keep it on my to do list and work on it. I meant anyone interested by "our" like you or me. I ought to have asked for the contact info for the correct person before leaving RootsTech. It is much easier to get some things done (with talking to hard to find/reach employees) in person at the conference. That is one of my favorite things about actually going.

  5. Alex,

    I agree that perhaps many of us (including myself) were a bit too judgmental of FHISO, and that their need to have community like us present things to them is valid. We can't expect them to fix our problems. It is just nice to know of a forum where I could send recommendations for standards and they'd actually have a chance of being reviewed by several major players.

  6. Randy,
    Thanks for asking for more detail. Let me try. As I remember (fuzzily), Ryan Heaton said that GEDCOMX "1.0" will be out in spec form by the end of April, I think. He promised to add more details about what GEDCOMX is to GEDCOMX.org in the future, and admitted that he knew he had not updated it much--or at all--recently. It was asked what the worst-case scenario is, and he said it would be if no one adopts GEDCOMX, but FamilySearch would still use it. He said the FamilySearch API already uses GEDCOMX specs, and whenever a program like RootsMagic works with Family Tree it is using GEDCOMX behind the scenes (just with no user export option, which is what most users think GEDCOMX would be). He also said that Ancestry.com's probate index will be transferred to them using GEDCOMX technology (as I understood). Ryan said this first "1.0" (used loosely) would incorporate certain things (which he listed in a slide) and future versions would have more capabilities, for example, as I said, Randy Wilson's topic about transfer of index data from one platform to another.

  7. Randy
    Where do you get the ##I finally figured out that FHISO doesn't plan to develop a GEDCOM replacement, but would endorse or embrace a GEDCOM replacement.## idea from????

  8. Thanks for sharing - many of us geneabloggers rely on one another's session feedback to get the bigger picture of where things are heading. I was too busy many times meeting new folks for the first time and promoting Genedocs with new contact cards. Did you get a card with a purple tree on it? Fewer than 2000 were available. The 10 Genedocs T-shirts handed out to key first folks will forever be collectable items too. The only Keynote Speaker to receive one was Syd!

    - Eric / Genedocs Founder

