This document contains code snippets and performance data for interacting with the FamilySearch Family Tree API. It includes code to add a person to the tree, authenticate to access a person's data, and retrieve a person with their relationships, along with performance metrics for Family Tree operations such as retrieving a person and a pedigree.
Family Search Developer Conference - Challenges and Opportunities
14. 90th Percentile Times

                         Family Tree     POC2    POC2     POC2
                         (1x, w/ CIS)    (1x)    (10x)    (20x)
Person & Relationships   1730 ms         48 ms   33 ms    44 ms
Pedigree                 2700 ms         18 ms   36 ms    49 ms
Pedigree Extend          2580 ms         12 ms   27 ms    35 ms
Person Card               293 ms         14 ms   10 ms    10 ms
Change History           5800 ms         35 ms   38 ms    50 ms
Change History Summary   6000 ms         21 ms   10 ms    11 ms
19. FamilySearchFamilyTree ft = ...;
//add a person
PersonState person = ft.addPerson(new Person()
  //named John Smith
  .name(new Name("John Smith", new NamePart(NamePartType.Given, "John"), new NamePart(NamePartType.Surname, "Smith")))
  //male
  .gender(GenderType.Male)
  //born in Chicago in 1920
  .fact(new Fact(FactType.Birth, "1 January 1920", "Chicago, Illinois"))
  //died in New York in 1980
  .fact(new Fact(FactType.Death, "1 January 1980", "New York, New York")));
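The slide elides how ft is obtained. A minimal sketch of that setup, assuming the fluent client documented for the FamilySearch GedcomX Java SDK (the sandbox flag and the username, password, and developerKey values are placeholders you would supply yourself):

//connect to the Family Tree (true = sandbox/test environment)
FamilySearchFamilyTree ft = new FamilySearchFamilyTree(true)
  //and authenticate with OAuth 2 password credentials
  .authenticateViaOAuth2Password(username, password, developerKey);

Once authenticated, the state object carries the access token, so subsequent calls such as addPerson above go through without further ceremony.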
21. FamilySearch.getPersonWithRelationships('KW7S-VQJ', {persons: true}).then(function(response) {
  console.log(response.getPrimaryPerson().getName());
  var spouses = response.getSpouses();
  for (var s = 0, spousesLen = spouses.length; s < spousesLen; s++) {
    console.log(spouses[s].getName());
    var children = response.getChildren(spouses[s].getId());
    for (var c = 0, childrenLen = children.length; c < childrenLen; c++) {
      console.log(children[c].getName());
    }
  }
});
Editor's Notes
Our data problems are one type of big data problem. We have lots of data, but at least it is mostly of a known structure. Still, dealing with that much data in a way that allows rapid access and update is a challenge. Our pedigree database is now north of 15 TB. We have more than 20 TB of historical record data. Our record images, at the relatively low resolution and quality level that we publish on the Internet, still amount to more than 2 PB. And access rates continue to increase to support more users and more uses. Having the ability to scale our systems up and down is crucial to handling these challenges.
We can't afford to have system outages cause our patrons and your customers to wait, or worse, to be denied access. High redundancy built into our systems at every component level is a must. To achieve this, we need a scalable, dynamic, and highly resilient environment for our systems to run in. Hence the AWS cloud.
Getting our customers inside our applications, or inside yours that depend on our system availability, is crucial to us. Once there, we need to be able to give each person access to the information available, and do it in a way that keeps their interest, satisfies their desire, and brings them back for more.
Besides localizing our applications into 40 languages, we are trying to understand what makes family history interesting and relevant in various parts of the world. Future support for Jia Pu, or for clan associations, is something we are considering.
Jia Pu represent genealogical data rather differently from how we visualize it in the Western world. Direct support for such visualization enhances cultural acceptance.
We already have a worldwide reach with our products. However, they are very much centrally run in the eastern United States. The experience for someone accessing our site across the Internet from Kenya is much different from the experience in the Western Hemisphere. Solutions include replicating and distributing our systems into regional centers to place them closer to the patrons.
Throughout a 24-hour period, our user load ranges between 8,000 and 32,000. The trend is sloping upwards at a good clip. Add to that the user-generated traffic that hits our site from partners' tools, and the load becomes quite tricky to manage. We keep adding infrastructure elements to help out, but it's difficult to keep up. And besides, this represents a four-fold swing between the lowest and highest traffic within 24 hours. Sizing permanently for that peak is very expensive. Here again, dynamic infrastructure can help us.
While we have had large parts of our website running in the Amazon cloud for more than four years, we are now moving more and more workloads into the cloud. Let's take a look at some of the things we are seeing with this.
The ability to get compute on demand enables us to do this job in a reasonable time. Pulling some tricks, such as using spot instances, saves us a lot of money.
Dynamic infrastructure enables new approaches to large data problems such as ETL.
Accessibility here means programmer accessibility to our APIs and services. Not everyone is interested in managing HTTP and understanding HATEOAS.
Be sure to attend Dallan's talk Thursday on the JavaScript SDK and Friday on the JavaScript Reference Client.
Of course, there have been many attempts at some of this: GEDCOM, GedML, GENTECH. But none enjoys wide and modern support while representing the types of data and media that are common today. This is why we created GedcomX. Is it a standard? Perhaps not in the strictest sense. But we would like to propose it as a working model toward such a standard.
As a data transfer specification, GedcomX facilitates computer reasoning on both conclusions and source data.
Because we have a common way to express historical record data and conclusion data, we can write programs to reason about them.
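To make that concrete, here is a minimal sketch, assuming the GedcomX Java model classes used on slide 19 plus a Gedcomx container with conventional list accessors; the document built here is hypothetical, but the same loop would apply whether its persons came from extracted record data or from user conclusions:

//a hypothetical GedcomX document; it could just as well have been parsed
//from a historical record extraction or from a Family Tree response
Gedcomx doc = new Gedcomx();
doc.setPersons(Arrays.asList(new Person()
  .name(new Name("John Smith", new NamePart(NamePartType.Given, "John"), new NamePart(NamePartType.Surname, "Smith")))
  .fact(new Fact(FactType.Birth, "1 January 1920", "Chicago, Illinois"))));

//because record data and conclusion data share one model, a single loop can
//reason over both: here, report everyone with a known birth fact
for (Person p : doc.getPersons()) {
  for (Fact fact : p.getFacts()) {
    if (FactType.Birth.equals(fact.getKnownType())) {
      String fullName = p.getNames().get(0).getNameForms().get(0).getFullText();
      System.out.println(fullName + " was born " + fact.getDate().getOriginal() + " in " + fact.getPlace().getOriginal());
    }
  }
}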