Functional Ideas for a Cloudy Future

Richard Minerich

Slides from my GLFPC 2012 Keynote

Functional Ideas for
a Cloudy Future
Richard Minerich
@Rickasaurus
Senior Researcher
at Bayard Rock

Properties of FP?
(It Depends on Who You Ask)

- First Class Functions
- Currying, Composition, Combinators
- Low Level Abstraction, Metaprogramming
- Immutability, Fancy Types, Constraints
- Fast Tail Recursion, Scope Minimization
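
As a rough illustration of the first two bullets, here is a minimal Python sketch. The helper names (`compose`, `add`, `scale`) are made up for this example, and `functools.partial` stands in for currying, which Python does not have natively.

```python
from functools import partial, reduce

def compose(*funcs):
    """Compose functions right to left: compose(f, g)(x) == f(g(x))."""
    return reduce(lambda f, g: lambda x: f(g(x)), funcs, lambda x: x)

def add(a, b):
    return a + b

def scale(factor, x):
    return factor * x

# Partial application: fix the first argument to get a one-argument function.
add_ten = partial(add, 10)
double = partial(scale, 2)

# First-class functions: build a new function out of existing ones.
double_then_add_ten = compose(add_ten, double)

results = list(map(double_then_add_ten, [1, 2, 3]))
print(results)
```

Small single-purpose functions like these combine without any shared state, which is the property the later slides lean on.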

The Spectrum of Functional

Convenience: Make Life Easy Now
“FP”: the overlap
Constraints: Make Life Easy Later

Referential Transparency
- It's all about scope!
- Mutation only infects as far as its scope
- Global variables can be OK, if your referential
transparency scope is a process
- This can be a function, class, thread, process, or
even a whole computer

What is Functional Programming?
- Complementary convenience and constraints
- A highly constrained set of approaches to
programming
- Where you lose in order to gain
- Low level constraints that propagate upwards
to the top level of your program

Program Scope over my Career
• Largest scope was usually a process with one
thread
• Then a process with a few threads
• Then a process with many threads
• Then a few machines
• Now a ton of machines

We need to scale out
• Desktop apps are going away
• Hosted hardware is on the way out
• No one cares about little data
- But! -
• Old algorithms don’t generalize well
• New tradeoffs between speed and scope
• Too many costs to keep track of

Thinking about Resource Costs

Far Machines    Far Network
Machines        Network
Processes       Disk
Threads         Memory
Instructions    Cache

What is Cloud Computing?
- More than just a sneaky way to charge a ton
for hosting

- Paradigms that simplify resource management
- You always lose in order to gain
- High level constraints that propagate
downward into your subtasks

Papers Published Over Time
(Microsoft Academic Search April 2012)
[Chart comparing the search terms “Cloud Computing” and “Type System”]

Properties of Cloud Computing
- Resources (Network, Disk, Memory, Cache)
- What constraints can make this easier?
- Force everything into one of a few styles of
computation?
- What if what we want to do is still possible but
doesn't fit our cluster’s paradigm?
- Where's the escape hatch?

Cloud Computing Methodologies
(Warning, Gross oversimplifications ahead!)

- MPI (Fixed Processes)
(OpenMPI, Tempest)
- Agents (Dynamic Processes)
(Erlang, Parallel Haskell, Akka)
- MapReduce (More Like Collect-GroupBy-Fold)
(Hadoop, Google)
- Others/Hybrids
(Iterative Map-Reduce, Mesos/Spark)
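
The "Collect-GroupBy-Fold" reading of MapReduce can be sketched in plain Python. This is a single-process illustration only: the function names are invented for the example and no Hadoop API is involved. On a real cluster each phase would be spread across machines.

```python
from collections import defaultdict
from functools import reduce

def collect(documents):
    """'Map' phase: emit a (word, 1) pair for every word in every document."""
    return [(word.lower(), 1) for doc in documents for word in doc.split()]

def group_by_key(pairs):
    """'Shuffle' phase: gather all values that share a key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(groups)

def fold(groups):
    """'Reduce' phase: fold each group's values into a single result."""
    return {key: reduce(lambda a, b: a + b, values, 0)
            for key, values in groups.items()}

def word_count(documents):
    return fold(group_by_key(collect(documents)))

counts = word_count(["the cloud is functional", "the future is cloudy"])
print(counts)
```

Because each phase is a pure function of its input, the phases can be rerun or moved to another machine after a failure, which is one reason the model is robust.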

From: Flexible and Efficient Distributed Resolution of Large Entities
(Molnar, et al.)

This is Word Count.
Seriously.
63 Lines!

Hadoop
(without losing your mind)

- Pig/Hive if your problem is simple
- Scoobi with Scala
- Scalding (on Cascading) on Scala
- F# + .NET API once Microsoft Ships

From the Pangool Website: http://pangool.net/benchmark.html

Which to Pick?
MPI/Agents =
Difficult to get right, Extremely Powerful
MapReduce =
Limiting, Easier to use, Robust to failures
Middle Road =
Iterative MapReduce, Mesos/Spark
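
To make the "middle road" concrete, here is a hedged sketch of the iterative MapReduce idea: rerun a map pass and a reduce pass until the result stops changing. The example (labelling connected components of a small graph) and all names in it are assumptions chosen for illustration, not taken from the talk or from any framework.

```python
def map_phase(labels, edges):
    """Each node sends its current label to itself and to its neighbours."""
    for node, label in labels.items():
        yield node, label
    for a, b in edges:
        yield b, labels[a]
        yield a, labels[b]

def reduce_phase(pairs):
    """Each node keeps the smallest label it received."""
    best = {}
    for node, label in pairs:
        best[node] = min(label, best.get(node, label))
    return best

def connected_components(nodes, edges):
    labels = {n: n for n in nodes}   # every node starts in its own component
    rounds = 0
    while True:
        new_labels = reduce_phase(map_phase(labels, edges))
        rounds += 1
        if new_labels == labels:     # fixed point: no label changed this round
            return labels, rounds
        labels = new_labels

labels, rounds = connected_components([1, 2, 3, 4, 5], [(1, 2), (2, 3), (4, 5)])
print(labels, rounds)
```

Plain MapReduce would have to write and reread its data on every round. Keeping the working set available between rounds is the kind of relaxation the hybrid systems offer.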

Cloud Computing is
Functional Programming

- Can’t Escape Referential Transparency
- Simple Composition is Key to Small Programs
- Object Oriented: a Square Peg in a Round Hole

Thanks for Listening!
Any Questions?
Visit my blog for ants and rants:
RichardMinerich.com
Follow me on Twitter:
@Rickasaurus
Come to NYC for the SkillsMatter F# Tutorials
June 5th and 6th: is.gd/fsharptutorials

Functional Ideas for a Cloudy Future

1. Functional Ideas for a Cloudy Future Richard Minerich @Rickasaurus Senior Researcher at Bayard Rock

3. Functional Programming? + =

4. Properties of FP? (It Depends on Who You Ask) - First Class Functions - Currying, Composition, Combinators - Low Level Abstraction, Metaprogramming - Immutability, Fancy Types, Constraints - Fast Tail Recursion, Scope Minimization

5. The Spectrum of Functional - Convenience: Make Life Easy Now - “FP”: the overlap - Constraints: Make Life Easy Later

6. Referential Transparency - It's all about scope! - Mutation only infects as far as its scope - Global variables can be OK, if your referential transparency scope is a process - This can be a function, class, thread, process, or even a whole computer

7. What is Functional Programming? - Complementary convenience and constraints - A highly constrained set of approaches to programming - Where you lose in order to gain - Low level constraints that propagate upwards to the top level of your program

8. Program Scope over my Career • Largest scope was usually a process with one thread • Then a process with a few threads • Then a process with many threads • Then a few machines • Now a ton of machines

10.

11. We need to scale out • Desktop apps are going away • Hosted hardware is on the way out • No one cares about little data - But! - • Old algorithms don’t generalize well • New tradeoffs between speed and scope • Too many costs to keep track of

12. Thinking about Resource Costs - Far Machines / Far Network - Machines / Network - Processes / Disk - Threads / Memory - Instructions / Cache

13. What is Cloud Computing? - More than just a sneaky way to charge a ton for hosting - Paradigms that simplify resource management - You always lose in order to gain - High level constraints that propagate downward into your subtasks

14. Papers Published Over Time (Microsoft Academic Search April 2012) “Cloud Computing” “Type System”

15. Properties of Cloud Computing - Resources (Network, Disk, Memory, Cache) - What constraints can make this easier? - Force everything into one of a few styles of computation? - What if what we want to do is still possible but doesn't fit our cluster’s paradigm? - Where's the escape hatch?

16. Cloud Computing Methodologies (Warning, Gross oversimplifications ahead!) - MPI (Fixed Processes) (OpenMPI, Tempest) - Agents (Dynamic Processes) (Erlang, Parallel Haskell, Akka) - MapReduce (More Like Collect-GroupBy-Fold) (Hadoop, Google) - Others/Hybrids (Iterative Map-Reduce, Mesos/Spark)

17. From: Flexible and Efficient Distributed Resolution of Large Entities (Molnar, et al.)

18. This is Word Count. Seriously. 63 Lines!

19. Hadoop (without losing your mind) - Pig/Hive if your problem is simple - Scoobi with Scala - Scalding (on Cascading) on Scala - F# + .NET API once Microsoft Ships

20. Scoobi

21. Scalding

22. From the Pangool Website: http://pangool.net/benchmark.html

23. Which to Pick? MPI/Agents = Difficult to get right, Extremely Powerful MapReduce = Limiting, Easier to use, Robust to failures Middle Road = Iterative MapReduce, Mesos/Spark

24. Mesos: You don’t have to choose Spark

25.

26. This is Word Count. Seriously!?

27. 10 Lines. That’s 53 fewer, or ~16% of the original

28. Cloud Computing is Functional Programming - Can’t Escape Referential Transparency - Simple Composition is Key to Small Programs - Object Oriented: a Square Peg in a Round Hole

29. Thanks for Listening! Any Questions? Visit my blog for ants and rants: RichardMinerich.com Follow me on Twitter: @Rickasaurus Come to NYC for the SkillsMatter F# Tutorials June 5th and 6th: is.gd/fsharptutorials

Editor's Notes

We do research and development on anti-money laundering, and some of the largest banks in the world use our products every day.
How many of you loved Legos when you were kids? Isn’t this what we really want programming to be like? A big pile of little parts: it’s obvious how they work, but we don’t want to have to make them all from scratch. I loved Legos as a child. They’re simple and intuitive, but you can compose them into amazing things. If you had to make the Legos yourself from smaller pieces, they would be tedious. But big blocks like Duplos severely constrain your imagination. Legos sit at just the right level of abstraction. Learning to program in BASIC was more of a challenge, though; it ended up being more like weaving and less like building with Legos. Tragically, I didn’t discover functional programming until I was almost 30.
At one time I was an imperative guy who wrote image processing code in C++ and C#. Those were hard times, full of null exceptions and race conditions. It could take several months to make a product releasable with a sizable team. Then I went to a talk by Rich Hickey and he showed me just how awesome FP can be. After I learned functional programming in F#, my productivity skyrocketed and bugs disappeared; I was able to make much cooler stuff in a much shorter time. I even had more time because the old stuff required less maintenance. I more than quadrupled my productivity. Now I spend all that newly found free time giving talks and arguing with object-oriented programmers on the internet.
Here’s where it gets tricky. Just what is functional programming? Well it depends on who you ask. Python users will tell you it’s something that comes in a module, while Haskell programmers will tell you that just about everyone else is faking it. Really, it’s more of a spectrum where as you get more and more functional you gain more and more benefits but also have to give up some things along the way.
FP is the intersection of some set of convenience features and some set of constraints. The more convenience features you have, like nice tuple syntax or comprehensions, the easier it is to get stuff done fast. The more constraint features you have, like immutability and fancy types, the easier it is to revisit and refactor later. This isn’t purely true, though; for example, you may find you need to write fewer tests with fancy types. The way I see it, the more your conveniences share with your constraints, the more “functional” your language is. As you squish these two Venn diagrams together, more and more of the features on either side complement each other. The constraint features keep you protected from the convenience features getting out of hand. The convenience features keep the constraints from slowing down your productivity. Think of Python as when there’s just a tiny bit of overlap and Haskell as when they’re almost completely overlapping. Everyone else lands somewhere in the middle: F# and Haskell closer to squished; C#, JavaScript and Ruby a bit further out.
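
A small Python sketch of one convenience (a comprehension) working alongside one constraint (immutability). The `Account` type and its data are hypothetical, invented only to show the two kinds of feature side by side.

```python
from dataclasses import dataclass, replace, FrozenInstanceError

@dataclass(frozen=True)   # constraint: instances cannot be mutated after creation
class Account:
    owner: str
    balance: int

accounts = [Account("ann", 50), Account("bob", 200), Account("cy", 125)]

# Convenience: a comprehension builds new values instead of editing old ones.
with_bonus = [replace(a, balance=a.balance + 10)
              for a in accounts if a.balance >= 100]

try:
    accounts[0].balance = 0   # the constraint rejects in-place mutation
    mutation_blocked = False
except FrozenInstanceError:
    mutation_blocked = True

print(with_bonus, mutation_blocked)
```

The comprehension keeps the code short today, and the frozen type means nobody can quietly change an `Account` somewhere else later.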
Now I could spend days telling you all about functional programming, but there’s one idea in FP that I would say is the most important. That idea is referential transparency. All referential transparency means is that from here I can understand what all the stuff in scope does. It’s all deterministic: for a given input, you’ll always get the same output (unless there’s something like hardware failure). The most interesting thing about referential transparency is that it doesn’t need to hold for your entire program to hold for most of it. You can write that algorithm you know is fast imperatively, and if you wrap it intelligently it’s just as useful from the outside as if it were done with pure functional programming. But you do lose some confidence about the properties of that function.
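
The wrapping idea can be sketched in a few lines of Python. The function below is a made-up example: it mutates local state freely, but nothing mutable escapes, so from the outside it behaves like a pure function.

```python
def running_totals(values):
    """Return cumulative sums; all mutation stays inside this function's scope."""
    totals = []               # local mutable state, never visible to callers
    acc = 0
    for v in values:
        acc += v              # imperative update, confined to this call
        totals.append(acc)
    return tuple(totals)      # hand back an immutable result

data = (3, 1, 4, 1, 5)
first = running_totals(data)
second = running_totals(data)
print(first, first == second)
```

Callers get the same output for the same input and the input is left untouched, so the scope of the mutation is exactly one function call.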
As a fuzzy definition: a complementary set of convenience features and constraints. For example, if you know how all the code underneath you works, it makes it much easier to ensure safety at a higher level. Constraints make it much easier to think about what your program is doing.
We’re all being dragged along, some faster than others depending on the kind of work we do. In fact, some older programming languages like Python and OCaml were effectively crippled by decisions they made years ago when it looked like we could scale on one CPU forever. In both languages the root of this problem is called “The giant lock”. I hope as we move to even larger scopes those locks become somewhat irrelevant as the cost of having many processes becomes dwarfed by other factors.
This is the mandatory Moore's law slide before talking about cloud computing. All aspects of computing will eventually end up looking like the graph on the right. Strange that computation peaked out way before storage and memory, but that’s just how it ended up. Now we’re left with having to find interesting ways to deal with it. (Source: http://www.indybay.org/newsitems/2006/05/18/18240941.php)
(Raise hands if you’ve seen this slide before.) However, something like Moore’s Law still lives, for now. As you’ve probably heard, we’re still gaining ground on the power efficiency front. We can scale out instead of up.
You really can’t escape it: tablets are just the beginning, and desktop computers as we know them are on the way out for most people. The unfortunate part of this whole deal is that we can’t apply most of our work directly in any of the scaled-out models. So we’re stuck in a world that boldly marches on, dragging us kicking and screaming into a much harder way of doing things. There are a lot of new things to consider in this brave new world beyond clock cycles.
This is a complete conceptual hazard. It’s really hard to keep all of this in your head at once and still come up with solutions to interesting problems. Each of these things has different sub-properties to consider in different situations as well. For example, sometimes just network bandwidth matters, sometimes latency, sometimes both. To get things done in the global network, we’re going to need a way to reason about these things without having to keep them all in our head at once.
At first, I was convinced that cloud hosting in general was pretty much a big scam, but as the data has grown bigger I’ve seen the error of my ways.
- Often you don’t need these computers for very long.
- Maintaining the cluster
- For many problems you can get near-linear scaling, so it’s pretty awesome to be able to fire up a ton of instances; in general this costs about the same as using fewer computers.
- They provide great frameworks and tools for thinking about these kinds of problems that reduce the need to worry about resources so much.
- They allow you to think in more general terms (like O notation).
- With each methodology you take on some communication constraints in order to make problems easier to think about.
- While type systems are like a constraining floor that you can’t fall through, cloud computing methodologies in general are more like a ceiling that dictates how parts of your program are combined.
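
The claim that many instances cost about the same as a few can be checked with simple arithmetic. The sketch below assumes perfectly linear scaling and billing per machine-hour; the job size and the price are invented numbers, not real quotes.

```python
def job_cost(total_machine_hours, machines, price_per_machine_hour):
    """Wall-clock hours and total cost, assuming perfectly linear scaling."""
    wall_clock_hours = total_machine_hours / machines
    cost = machines * wall_clock_hours * price_per_machine_hour
    return wall_clock_hours, cost

# Hypothetical job: 100 machine-hours of work at a made-up $0.50 per machine-hour.
small = job_cost(100, machines=4, price_per_machine_hour=0.50)
large = job_cost(100, machines=50, price_per_machine_hour=0.50)
print(small, large)
```

Under these assumptions the bill is identical and only the waiting time changes. Real jobs scale a little worse than linearly, so the larger cluster would cost somewhat more in practice.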
In the past three years the number of papers published on algorithms in the cloud has skyrocketed. Rich Hickey: working on Datomic. Simon Peyton Jones: working on parallel Haskell. Why? Because it’s hugely useful for solving hard problems and computers just aren’t getting much faster. And there’s the fact that the amount of data lying around is skyrocketing as our storage capabilities continue to increase. What are we going to do with all that data?
- MapReduce: you can only do two things, and in this order
- MPI/Agents: more about how you communicate
- Generalizations of MapReduce
So what do we do? We force you into one of several choices of computing methodology. Each has different constraints. With multi-paradigm problems you can often fake it for smaller data sets, but as they grow it becomes more and more important to be flexible. Finally, unlike functional programming, there is no magic escape hatch. Calling a C library is no longer the answer to all of life’s performance problems.
Whirlwind tour! When you first get the cloud computing bug it can all be a bit overwhelming. There’s just a ton of frameworks, each of which sports very nice benchmarks for the things it chooses to benchmark on. Some are mature and others are small projects. They all have limitations and cohorts of enthusiastic followers. You may be familiar with some of these but we’re just going to focus on a few.
- MPI: centrally controlled
- Agents: can launch each other
- MapReduce: very constraining, but people always break the rules
- Iterative MapReduce (academic/unpolished)
- Spark: what I see as the future of cloud computing; constraining, but not so much that you must constantly break the rules
This research was done with 14 off-the-shelf computers put together by college students in Hungary. It’s one of the first examples of entity resolution that can handle the data that exists right now in the real world. In my business, large scale entity resolution with any kind of guarantees seemed like a pipe dream. This paper changed my whole perspective. Sure, we’re measuring time in hours, but if your task is already taking hours on one computer, what does it matter?
- Word Count in MapReduce with Java, pretty much the “Hello World” of the MapReduce paradigm. This hurts to even look at.
Just when I thought I had escaped the tedium of object oriented programming, here I was trying to use a paradigm that fits functional programming like a glove, and yet I was reduced to writing pages of code to do a simple word count. To make matters even worse, I had lost my beautiful Visual Studio tooling. The friction was just unbelievable. The thought of how many lines of code a real entity resolution system would be made me a bit queasy, to say the least.
- Note the iteration over elements. Is this really necessary when we can have higher level abstractions?!
- Map -> Choose (Open Map): one to many
- Partition -> Sort and group by key
- Reduce -> Constrained Reduce: many to many (or fewer)
Now, don’t get me wrong. There are reasons why many of the largest software companies (including IBM and Microsoft) are embracing Hadoop. It’s big, it schedules well, it’s pretty darn fast and, most importantly, it’s mature.
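The three stages above can be sketched in a few lines of plain Python. This is only an in-memory toy to show the shape of the paradigm, not how Hadoop is implemented; the function names are mine, and the reducer here is simplified to emit exactly one value per key.

```python
from itertools import groupby
from operator import itemgetter

def map_phase(records, mapper):
    """Map: each input record may emit zero or more (key, value) pairs."""
    return [pair for record in records for pair in mapper(record)]

def partition_phase(pairs):
    """Partition: sort by key, then gather all values that share a key."""
    ordered = sorted(pairs, key=itemgetter(0))
    return [(key, [value for _, value in group])
            for key, group in groupby(ordered, key=itemgetter(0))]

def reduce_phase(groups, reducer):
    """Reduce: collapse each key's values (simplified to one output per key)."""
    return [(key, reducer(key, values)) for key, values in groups]

# Word count expressed as a mapper and a reducer.
def word_mapper(line):
    return [(word, 1) for word in line.split()]

def sum_reducer(_key, values):
    return sum(values)

lines = ["a rose is a rose", "is a rose"]
counts = reduce_phase(partition_phase(map_phase(lines, word_mapper)), sum_reducer)
print(counts)
```

The only job-specific code is the two tiny functions at the bottom; everything else is the fixed pipeline the framework imposes.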
Very simple and clean: you say what you want, not how you want it done.
Very similar to Scoobi, although based on Cascading. For some reason the Scalding folks love to use a lot of type annotations.
Both of these programs are doing almost exactly what that Java code before was, except they are composing little functional subprograms instead of trying to do it all by hand. Actually, Scalding is doing a bit more work in this code, as it lowercases and removes punctuation. Otherwise they’re quite similar and fairly beautiful to look at. Against the pages of Java it took to do Word Count before, this is a godsend. I do think the Scoobi looks a bit nicer because it’s a bit less verbose. That’s a matter of style and comfort with the language, though, and less about the frameworks.
There are just a ton of Hadoop toolkits, but let’s focus on the blue and green ones. Seeing as how Scalding is built on top of Cascading, it seems like a poor choice for a small company with limited resources. Here we run into a bit of a problem though. On one hand we have Scalding, a big project by the folks at Twitter. On the other we have Scoobi, made by OpenNICTA, which is a small institution with a bent toward scientific computing. Pangool is a slightly less horrible API for Java. Scrunch is Scala for Crunch.
But MapReduce is just one of many choices for cloud computing paradigms. When you go solve a difficult problem with the cloud, your choice should depend on a ton of factors. Can you accomplish what you’re looking to do? What technologies are you comfortable with? Are you comfortable using research software? Most importantly, can you get this done without talking to anyone from IT?
For a smaller company with limited resources like mine, Mesos is quite significant. It allows you to build one cluster and perform many different styles of computation, all sharing the same scheduler. Notice that Spark writes a lot like Scoobi, but without all of the ceremony. It also loosens the straps of the standard MapReduce straitjacket a bit by allowing you to keep things in memory between iterations.
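Why keeping things in memory between iterations matters can be shown with a toy in plain Python. None of this is Spark's API: `CountingSource` is a made-up stand-in for slow storage, and the "algorithm" is a throwaway running estimate chosen only so both versions have something to iterate on.

```python
class CountingSource:
    """Hypothetical stand-in for slow storage; counts how often it is read."""
    def __init__(self, values):
        self._values = list(values)
        self.reads = 0

    def load(self):
        self.reads += 1
        return list(self._values)

def iterate_reloading(source, steps):
    """Classic MapReduce style: every iteration re-reads its input."""
    estimate = 0.0
    for _ in range(steps):
        data = source.load()
        estimate = (estimate + sum(data) / len(data)) / 2
    return estimate

def iterate_cached(source, steps):
    """In-memory style: read once, then reuse the data on every iteration."""
    data = source.load()
    estimate = 0.0
    for _ in range(steps):
        estimate = (estimate + sum(data) / len(data)) / 2
    return estimate

reloading_source = CountingSource([1, 2, 3])
cached_source = CountingSource([1, 2, 3])
reloaded = iterate_reloading(reloading_source, steps=5)
cached = iterate_cached(cached_source, steps=5)
print(reloaded, reloading_source.reads, cached, cached_source.reads)
```

Both versions compute the same answer; the difference is five trips to storage versus one, and on a cluster that gap is what dominates iterative jobs.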
For a lot of difficult problems Spark is hugely better, but Hadoop has better tooling and a lot of people using it. With Mesos you get the best of both worlds. It’s a recent discovery for me, but I’m already a huge fan.
Just to come back to this for a minute, look at this code and imagine it was your future. The thought of this being my own future almost brought me to tears. While the giants of tech like Microsoft and IBM have been asleep at the wheel, functional programmers have been busy solving the hard problems. There’s absolutely no reason to go back to this kind of nightmare. You’d have to be certifiably insane.
Now imagine this: you split the lines into words, pair each word with a count of one, group them up by word and then add the counts. It reads almost like English.
Not just the ideas from functional programming. So, just as predicted years ago, functional programming has come to dominate at least one aspect of modern computing. Even when using Java you are forced to write small programs which are referentially transparent and operate in parallel. But with functional programming you can have small, composable, referentially transparent parts with hidden implementations you don’t have to care about.
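Here is a small sketch of why referential transparency and parallelism go together. The `normalize` function and the sample words are my own illustration; the thread pool is only there to show that a pure function gives the same answers no matter how the work is scheduled, not to demonstrate a speedup.

```python
from concurrent.futures import ThreadPoolExecutor

def normalize(word):
    """Pure function: the result depends only on the argument."""
    return word.strip(".,!?").lower()

words = ["Hello,", "World!", "hello", "WORLD"]

# Run the same function sequentially and across a pool of workers.
sequential = [normalize(w) for w in words]
with ThreadPoolExecutor(max_workers=4) as pool:
    parallel = list(pool.map(normalize, words))

print(sequential == parallel)
```

Because `normalize` touches no shared state, the framework is free to run it anywhere, in any order, and retry it on failure without changing the result.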
We’ll have some of the Cloud Numerics guys there giving a tutorial on how to do linear algebra in the cloud.

Editor's notes