In April 2015, Apache Geode (incubating) was born from Pivotal’s GemFire, the distributed in-memory database. However, the donation of over 1M LOC was just the beginning of the journey. In this talk we discuss how the GemFire engineering team has adapted its development infrastructure, processes, and culture to embrace the “Apache Way”. We present lessons learned and best practices for new and incubating open source projects in the areas of initial code submission, IP clearance, governance policies, code review, and community building. We discuss the challenges the team faced and how we moved internal communication and software design processes to a community-driven model. In particular, we highlight effective strategies for growing a project community and embracing new members. Finally, we show how changing to the open source model has increased both productivity and quality.
3. How we transformed a hard-core
commercial engineering team into
an open source community-driven
powerhouse*
* forward-looking statement
4. Alternative titles?
What the government doesn’t want you to know about open source
6 reasons to be addicted to open source
7 things Lady Gaga has in common with open source
8 unbelievable things you never knew about open source
Why you should give up sex and devote your life to open source
11 ways investing in open source can make you a millionaire
http://www.contentrow.com/tools/link-bait-title-generator
8. Apache Geode is…
“…an in-memory, distributed database
with strong consistency
built to support low latency
transactional applications
at extreme scale.”
9. 2004 2008 2014
• Massive increase in data
volumes
• Falling margins per
transaction
• Increasing cost of IT
maintenance
• Need for elasticity in
systems
• Financial Services
Providers (every major
Wall Street bank)
• Department of Defense
• Real Time response needs
• Time to market constraints
• Need for flexible data
models across enterprise
• Distributed development
• Persistence + In-memory
• Global data visibility needs
• Fast Ingest needs for data
• Need to allow devices to
hook into enterprise data
• Always on
• Largest travel Portal
• Airlines
• Trade clearing
• Online gambling
• Largest Telcos
• Large manufacturers
• Largest Payroll processor
• Auto insurance giants
• Largest rail systems on
earth
10. China Railway
Corporation
5,700 train stations
4.5 million tickets per day
20 million daily users
1.4 billion page views per day
40,000 visits per second
*http://pivotal.io/big-data/pivotal-gemfire
Indian Railways
7,000 stations
72,000 miles of track
23 million passengers daily
120,000 concurrent users
10,000 transactions per minute
11. World population: ~7,349,000,000
China Railway Corporation (China): population 1,401,586,609
Indian Railways (India): population 1,251,695,616
Together: ~36% of the world population
12. Application patterns
• Caching for speed and scale using read-through,
write-through, and write-behind
• OLTP system of record with in-memory for speed,
on disk for durability
• Parallel compute grid
• Real-time analytics
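The three caching patterns named on this slide can be sketched in a few lines. This is a minimal illustration and not Geode's API: `BackingStore`, `Cache`, and `flush` are hypothetical names, the store is a plain dictionary standing in for a slow system of record, and a real write-behind cache would drain its queue asynchronously rather than on an explicit call.

```python
from collections import deque


class BackingStore:
    """Hypothetical stand-in for a slow system of record (e.g. a database)."""

    def __init__(self):
        self.rows = {}
        self.reads = 0
        self.writes = 0

    def load(self, key):
        self.reads += 1
        return self.rows.get(key)

    def save(self, key, value):
        self.writes += 1
        self.rows[key] = value


class Cache:
    """Toy cache showing read-through, write-through, and write-behind."""

    def __init__(self, store, write_behind=False):
        self.store = store
        self.write_behind = write_behind
        self.entries = {}
        self.pending = deque()  # writes queued for the store (write-behind only)

    def get(self, key):
        # Read-through: on a miss, load from the store and keep the value.
        if key not in self.entries:
            value = self.store.load(key)
            if value is None:
                return None
            self.entries[key] = value
        return self.entries[key]

    def put(self, key, value):
        self.entries[key] = value
        if self.write_behind:
            # Write-behind: acknowledge now, persist later.
            self.pending.append((key, value))
        else:
            # Write-through: persist before returning to the caller.
            self.store.save(key, value)

    def flush(self):
        # Drain queued writes to the store; returns how many were written.
        count = 0
        while self.pending:
            key, value = self.pending.popleft()
            self.store.save(key, value)
            count += 1
        return count
```

With write-through, the store is always current but every `put` pays the store's latency. With write-behind, `put` returns immediately and the store lags until the queue is drained, which is the trade-off between durability and speed that the slide alludes to.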
14. Some context
• 1M+ LOC, thousands of customers running
business-critical systems in production
• More than 12 years in development
• Large, multi-geo engineering team
• Established development practices
15. Why OSS? Why ASF?
• Open source is fundamentally changing software
buying patterns
• Customers avoid vendor lock-in and get
transparency, co-development of features
• It’s the community that matters
• ASF provides a framework for open source
19. Cleaning up the source
• Get rid of internal dependencies
• Make sure the build is easy and fast
• Remove embarrassing source comments :-)
• Make testing easy
20. Writing the proposal
• Use a prior project as a template (lots of options)
• Identify background, rationale, status, comparables,
risks
• Who are the committers?
21. Submitting the proposal
• Is your source available? Not necessary, but
helpful!
• Discussion will occur on
general@incubator.apache.org
• Wait for consensus and vote :-)
26. http://what’s important
• Brief product description
• Community coordinates
• Email lists (dev, user), JIRA, wiki
• Getting started (one page)
• Building and running the product
• How to obtain the source (no binaries before a release)
• Roadmap, things to work on
29. http://theapacheway.com
“The Apache Way is sort of like Zen. It's
something that's difficult to explain, has many
interpretations, and the best way to learn it is to
do it.”
31. The Apache Way
• Community over code
• If it didn’t happen on the mailing list, it didn’t
happen
• Rough consensus and working code: do-ocracy
• Decision model: +1, 0, -1 (also lazy consensus)
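The decision model can be made concrete with a small tally. This is a rough sketch under stated assumptions, not an official ASF tool: the function names are hypothetical, the distinction between binding and non-binding votes is ignored, and the majority rule (at least three +1 votes and more +1 than -1) follows the ASF's general voting guidance, which individual projects may adapt.

```python
from collections import Counter

VALID_VOTES = {+1, 0, -1}


def tally(votes):
    """Count +1, 0, and -1 votes; reject anything else."""
    votes = list(votes)
    for v in votes:
        if v not in VALID_VOTES:
            raise ValueError(f"invalid vote: {v!r}")
    counts = Counter(votes)
    return {"+1": counts[1], "0": counts[0], "-1": counts[-1]}


def lazy_consensus(votes):
    """Lazy consensus: the proposal stands unless someone objects with -1.

    Silence counts as assent, so an empty list of votes passes.
    """
    return tally(votes)["-1"] == 0


def majority_approval(votes, min_plus_ones=3):
    """Assumed majority rule: enough +1 votes, and more +1 than -1."""
    t = tally(votes)
    return t["+1"] >= min_plus_ones and t["+1"] > t["-1"]
```

For example, `lazy_consensus([+1, 0])` passes because nobody objected, while `majority_approval([+1, +1, 0])` fails because only two +1 votes were cast.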
41. Include your customers
• Increase their opportunity to interact directly with
committers
• Do they want to contribute existing tooling?
• Questions they will ask:
• Are you still committed to the product?
• Why should I pay?
42. Community activities
• ApacheCon! (and other conferences)
• OSCon, SpringOne2GX, QCon, …
• Local meetup groups
• Portland, Toronto, Palo Alto, San Francisco, London,
Cork, Pune
• Virtual meetups (Geode Clubhouse) 2x / month
• Blogs, Twitter, hackathon
43. Virtual meetups
• Two formats
• Technical deep-dive on a
specific subject
• Open mike, like a “standup”
where participants get to
bring up topics for
discussion
• Note: any decisions must still
be published on the dev list!
44. Be responsive
• More responsive and interactive communities have
better engagement and retention
• On a temporary basis, find community activists to
ensure questions, PRs, and bugs are addressed
quickly
• Soon this behavior becomes automatic