Welcome to
Technical Data Infrastructure Frameworks
Archonnex @ ICPSR
Data Science Management For All
Harsha Ummerpillai, Architect / Software Lead
Tom Murphy, Director of Computing and Network Services
About ICPSR
Mission
ICPSR advances and expands social and behavioral research, acting as a
global leader in data stewardship and providing rich data resources and
responsive educational opportunities.
About
An international consortium of more than 700 academic institutions and
research organizations, ICPSR provides leadership and training in data
access, curation, and methods of analysis for the social science research
community.
ICPSR maintains a data archive of more than 500,000 files of research in
the social sciences. It hosts 16 specialized collections of data in
education, aging, criminal justice, substance abuse, terrorism and other
areas of social research.
Introduction
Archonnex is a Digital Asset Management System (DAMS)
architecture defined to transition ICPSR to a newer
technology stack that meets the core and emerging business
needs of the organization. It aims to build a digital technology
platform that leverages ICPSR expertise and open source
technologies that are proven and well supported by their
communities.
Guiding Principles
• Comprehensive Digital Asset Management platform.
• Compliant with the Open Archival Information System (OAIS) reference model.
• Multi-tenancy. ICPSR needs to support multiple archives and agencies.
• Secure. Privacy and security are primary concerns for social research data.
• Service-oriented and modular.
• Scalable. Able to handle large datasets and peak activity spikes.
• Open source technologies with good community engagement.
• Enables standards-based metadata harvesting and data exports.
• Cohesive technology choices.
• Flexible UI components that can be reused and enable faster development.
Message-based Integration
• Apache ActiveMQ is the messaging server.
• Apache Camel provides simplified implementations of the most common
Enterprise Integration Patterns.
• Figure: High-level view of Camel's architecture (from Camel in Action).
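The routing idea behind message-based integration can be sketched without a broker. The snippet below is a minimal illustration only: Python's standard-library queue.Queue stands in for ActiveMQ destinations, a plain function stands in for a Camel route, and the destination names and message fields are hypothetical, not taken from Archonnex.

```python
import queue

# In-memory queues stand in for broker destinations; the names are hypothetical.
DESTINATIONS = {
    "image-processing": queue.Queue(),
    "metadata-extraction": queue.Queue(),
}


def route(message: dict) -> str:
    """Content-based router: choose a destination from the message's MIME type."""
    mime_type = message.get("mime_type", "")
    if mime_type.startswith("image/"):
        name = "image-processing"
    else:
        name = "metadata-extraction"
    DESTINATIONS[name].put(message)
    return name


def drain(name: str) -> list:
    """Consume every message currently waiting on one destination."""
    messages = []
    while not DESTINATIONS[name].empty():
        messages.append(DESTINATIONS[name].get())
    return messages
```

Producers only call route and consumers only call drain, so neither side needs to know about the other. That decoupling is the property the messaging layer is meant to provide.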
Infrastructure
Repository Engine
Virus Scanner
Message Queuing System
RDBMS
Cloud Storage
Deposit Manager
Single Sign On
Open API
Data Analysis Engine
Web / Use Analytics
Search Engine
Metadata Manager
Geo Tagger
Admin UI Search
Web UI Components
NAS Storage
Researcher Interface Agent
Batch Jobs
Subscription Manager
Payment Portal
Reports
Alerts
Image Processor
Workflow BPM
Infrastructure
Fedora
Virus Scanner
ActiveMQ
Oracle
AWS S3
Deposit Manager
SSO
Open API
Elasticsearch
Kibana
Solr
Fuseki
Geo Tagger
Admin Search
Web UI Components
Isilon
SEAD Agent
Batch Jobs
Subscription Manager
Payment Portal
Reports
Alerts
Image Processor
Preservation Manager
Activiti
Multi-tenancy
All service components support multiple tenants.
Supports tenant-specific configuration and preferences.
Web aspects of service components are embeddable
within the respective tenant web apps.
Workspace Manager and Search Manager are two
examples of embeddable UI plugins.
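One common way to support tenant-specific configuration is to layer per-tenant overrides over shared defaults. The sketch below assumes that approach; the slides do not say how Archonnex stores its configuration, and the tenant IDs and setting names are hypothetical.

```python
from collections import ChainMap

# Shared defaults applied to every tenant (hypothetical settings).
DEFAULTS = {
    "theme": "default",
    "max_upload_mb": 2048,
    "search_facets": ["subject", "year"],
}

# Each tenant overrides only what differs from the defaults (hypothetical tenants).
TENANT_OVERRIDES = {
    "openicpsr": {"theme": "open"},
    "archive-b": {"max_upload_mb": 512},
}


def tenant_config(tenant_id: str) -> ChainMap:
    """Return a view that checks the tenant's overrides first, then the defaults."""
    return ChainMap(TENANT_OVERRIDES.get(tenant_id, {}), DEFAULTS)
```

An embeddable UI plugin could then read tenant_config(tenant_id)["theme"] at render time, so the same component serves every tenant.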
Single Sign On & ID Management
Central authentication
and identification system.
Supports ORCID and social IDs (Google,
Facebook and LinkedIn).
Authorization management will support role-based
access controls.
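Role-based access control reduces to a mapping from roles to permissions plus a check against the roles a user holds. The following is a minimal sketch under that assumption; the role and permission names are hypothetical and not taken from the Archonnex design.

```python
# Hypothetical role-to-permission mapping.
ROLE_PERMISSIONS = {
    "depositor": {"deposit:create", "deposit:read"},
    "curator": {"deposit:read", "deposit:publish"},
    "admin": {"deposit:create", "deposit:read", "deposit:publish", "tenant:configure"},
}


def is_allowed(user_roles, permission: str) -> bool:
    """Grant access if any of the user's roles carries the permission."""
    return any(
        permission in ROLE_PERMISSIONS.get(role, set())
        for role in user_roles
    )
```

Unknown roles simply contribute no permissions, so a misspelled or retired role fails closed.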
Deposit Manager
Supports Submission Information Package (SIP) data ingest and storage;
coordinates virus scanning,
statistical file validation, variable extraction,
image processing and
metadata extraction.
Easy-to-use workspace and file management.
Ability to publish at a granular level.
Embeddable UI plugin supporting
tenant-specific configuration.
Supported ingest protocols: HTTP, SFTP and email.
Integrates with the BPM/Workflow Engine.
Preservation Manager
Implements transcoding of data specific to MIME types.
Generates the Archival Information Package (AIP) and
Dissemination Information Package (DIP).
Replicates the AIP to multiple storage systems for long-term preservation.
Performs fixity checks periodically.
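A fixity check compares a file's current checksum with the one recorded when it was archived. The sketch below uses SHA-256 from Python's standard library; the slides do not name the digest algorithm or the scheduling mechanism, so both the algorithm and the function names are assumptions.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so large datasets never load fully into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fixity_ok(path: Path, recorded_digest: str) -> bool:
    """True when the file still matches the digest recorded at ingest."""
    return sha256_of(path) == recorded_digest


# Usage: record a digest at ingest time, then re-check it later.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.dat"
    sample.write_bytes(b"archived bytes")
    recorded = sha256_of(sample)
    print("fixity intact:", fixity_ok(sample, recorded))
```

A periodic job would run fixity_ok over every replicated AIP and raise an alert on any mismatch.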
Search Manager
Full-featured text search using
Apache Solr.
Embeddable UI plugin supporting
tenant-specific configuration.
Coverage includes, but is not limited to, keywords, metadata,
text and geospatial fields.
Exploring GeoBlacklight for search and dissemination
of geographical data.
Anti-Virus Scanner
Antivirus scanning as a service.
Supports ClamAV and Sophos.
Can expand to
multiple nodes, allowing horizontal scalability
to support scanning large data sets.
SPSS Processor
Performs additional processing of IBM SPSS files.
Analyzes and reports potential missing
variables and inconsistencies.
Extracts variables and stores them
for online analysis tools.
Open API
RESTful services to
enable metadata harvesting and exports
using industry-wide standards and formats.
Examples: RDF, JSON-LD, DDI XML…
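As an illustration of a standards-based export, the sketch below serializes a metadata record as JSON-LD. It assumes the schema.org Dataset vocabulary, which the slides do not specify, and the input field names (title, identifier, keywords) are hypothetical.

```python
import json


def to_json_ld(record: dict) -> str:
    """Map an internal metadata record onto a schema.org Dataset document."""
    document = {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "name": record["title"],
        "identifier": record["identifier"],
        "keywords": record.get("keywords", []),
    }
    return json.dumps(document, indent=2)
```

A REST endpoint could return this string with the application/ld+json media type so that harvesters can parse it as linked data.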
Workflow Engine
Central workflow management providing a unified
action list for users.
Ability to model business process flows and
integrate them with technical components.
Activiti is the chosen technology.
Reports & Analytics
Captures all system and user activities within the components,
enabling effective provenance data collection.
Central consolidated storage for all logs.
Ability to discover, visualize and report on the data collected.
Elasticsearch & Kibana
Google Analytics (client and server side)
Content Specific Processors
Add-on modules that can derive and extract
custom attributes. These modules can be
invoked using messages and added to the processing
pipelines.
For example, image files can produce
thumbnails for easier display in the GUI.
The Image Processor module performs this function.
Geospatial data is published to
GeoServer.
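Pluggable processors of this kind are often wired up through a registry keyed by content type. The sketch below assumes that design; the decorator, the message fields and the thumbnail naming are hypothetical, and the thumbnail step only records a file name instead of producing an image.

```python
from typing import Callable, Dict, List

Processor = Callable[[dict], dict]

# Registry of add-on processors, keyed by MIME type prefix.
PROCESSORS: Dict[str, List[Processor]] = {}


def register(mime_prefix: str):
    """Decorator that adds a processor to the pipeline for one MIME prefix."""
    def decorator(func: Processor) -> Processor:
        PROCESSORS.setdefault(mime_prefix, []).append(func)
        return func
    return decorator


@register("image/")
def make_thumbnail(message: dict) -> dict:
    # Placeholder: a real module would render the image; this only names the output.
    return {**message, "thumbnail": message["filename"] + ".thumb.png"}


def process(message: dict) -> dict:
    """Run every registered processor whose prefix matches the message's MIME type."""
    for prefix, processors in PROCESSORS.items():
        if message["mime_type"].startswith(prefix):
            for processor in processors:
                message = processor(message)
    return message
```

Adding a new content type then means registering one more function, with no change to the dispatch code.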
Geo Tagger
Add-on module that can derive and process
geographical information from inputs
such as a street address, an IP address, shapes on a map
or markers on a map.
Generates geotag information for display
and to support search capabilities.
Research Data Integration
Ability to integrate with external data producers.
For example SEAD, OSF…
Web UI
HTML
JavaScript
CSS
Twitter Bootstrap
jQuery and plugins
Facebook ReactJS
Advanced REST Client
Protocol
HTTPS
HTTPS/REST/JSON/JSON-LD
HTTPS/SOAP/XML *
Middleware
J2EE application servers (Jetty, Apache Tomcat)
Spring MVC
Groovy/Grails
Ruby/Rails
Desktop UI
Java Swing
Java Web Start
Batch Automation UI
UC4 *
Control-M *
Protocol
HTTPS
HTTPS/REST/JSON
Java Network Launch Protocol (JNLP)
Middleware
J2EE application servers (Jetty, Apache Tomcat)
Spring MVC
Spring Remoting & Web Services
Protocol
SSH
Java RMI
Scripting/Orchestration
Shell Programming
Perl
Ruby
Groovy
Storage/Databases
Network File Storage
Oracle/MySQL/PostgreSQL
Amazon Cloud
Duraspace Cloud
ESB/Message Brokers
Apache ActiveMQ
RabbitMQ
Apache Camel
Source Code Management
Git
CVS *
Productivity Tools
Drupal/Confluence/Google Sites
JIRA
Bamboo
Fisheye
Crucible
Stash
Microsoft Office
Operating Systems
Servers (Linux)
Desktop (Linux/Windows/Mac)
Build Tools
Ant
Maven
openICPSR is scheduled to be
released on the new Archonnex platform
by the end of July 2016.
Questions?
Thank you
Thomas Murphy
tomurphy@umich.edu
CNS Director
Harsha Ummerpillai
harshau@umich.edu
Software Architect

Archonnex at ICPSR

  • 1. Welcome to Technical Data Infrastructure Frameworks Archonnex @ ICPSR Data Science Management For All Harsha Ummerpillai, Architect / Software Lead Tom Murphy, Director of Computing and Network Services
  • 2. About ICPSR Mission ICPSR advances and expands social and behavioral research, acting as a global leader in data stewardship and providing rich data resources and responsive educational opportunities. About An international consortium of more than 700 academic institutions and research organizations, ICPSR provides leadership and training in data access, curation, and methods of analysis for the social science research community. ICPSR maintains a data archive of more than 500,000 files of research in the social sciences. It hosts 16 specialized collections of data in education, aging, criminal justice, substance abuse, terrorism and other areas of social research.
  • 3. Introduction Archonnex is a Digital Asset Management Systems (DAMS) architecture defined to transition ICPSR to a newer technology stack meeting core and emerging business needs of the organization. It aims to build a digital technology platform that leverages ICPSR expertise and open source technologies that are proven and well supported by Open Source communities.
  • 4. Guiding Principles  Comprehensive Digital Asset Management Platform.  Open Archival Information Systems (OAIS model) compliant.  Multi-tenancy. ICPSR needs to support multiple archives and agencies.  Secure. Privacy and Security are primary concerns for social research data.  Service Oriented and Modular.  Scalable; Ability to handle large datasets and peak activity spikes.  Open Source technologies with good community engagement;  Enable standards based metadata harvesting and data exports.  Cohesive technology choices.  Flexible UI components that can be re-used and enables faster development.
  • 5. Message-based Integration
      • Apache ActiveMQ is the messaging server.
      • Apache Camel provides a simplified implementation of the most common Enterprise Integration Patterns.
      • Figure: High-level view of Camel's architecture (from Camel in Action).
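One of the Enterprise Integration Patterns mentioned above, the content-based router, can be sketched in a few lines. This is only an in-memory illustration in Python, not Camel or ActiveMQ code; the `Router` class, the queue names and the header name are all hypothetical.

```python
from collections import defaultdict, deque


class Router:
    """Content-based router: picks a destination queue from a message header."""

    def __init__(self, header, routes, default):
        self.header = header      # header name to inspect (hypothetical)
        self.routes = routes      # header value -> queue name
        self.default = default    # queue for messages that match no route
        self.queues = defaultdict(deque)  # stand-in for broker queues

    def send(self, message):
        value = message.get("headers", {}).get(self.header)
        destination = self.routes.get(value, self.default)
        self.queues[destination].append(message)
        return destination


# Hypothetical queue names and MIME types, for illustration only.
router = Router(
    header="mime_type",
    routes={
        "image/png": "image.processor",
        "application/x-spss-sav": "spss.processor",
    },
    default="deposit.unclassified",
)

dest_image = router.send({"headers": {"mime_type": "image/png"}, "body": "scan.png"})
dest_other = router.send({"headers": {"mime_type": "text/plain"}, "body": "readme.txt"})
```

In a real deployment the routing rule would live in the integration layer and the queues on the message broker; the sketch only shows the decision logic.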
  • 6.
  • 7. Infrastructure: Repository Engine, Virus Scanner, Message Queuing System, RDBMS, Cloud Storage, Deposit Manager, Single Sign On, Open API, Data Analysis Engine, Web / Use Analytics, Search Engine, Metadata Manager, Geo Tagger, Admin UI, Search, Web UI Components, NAS Storage, Researcher Interface, Agent, Batch Jobs, Subscription Manager, Payment Portal, Reports, Alerts, Image Processor, Workflow BPM.
  • 8. Infrastructure: Fedora, Virus Scanner, ActiveMQ, Oracle, AWS S3, Deposit Manager, SSO, Open API, Elasticsearch, Kibana, Solr, Fuseki, Geo Tagger, Admin, Search, Web UI Components, Isilon, SEAD, Agent, Batch Jobs, Subscription Manager, Payment Portal, Reports, Alerts, Image Processor, Preservation Manager, Activiti.
  • 9. Multi-tenancy: All service components support multiple tenants, including tenant-specific configuration and preferences. The web aspects of service components are embeddable within the respective tenant web apps. Workspace Manager and Search Manager are two examples of embeddable UI plugins.
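Tenant-specific configuration is often modelled as a set of shared defaults that each tenant can override. The sketch below shows that idea with Python's `ChainMap`; the tenant IDs and setting names are hypothetical and are not taken from Archonnex.

```python
from collections import ChainMap

# Shared defaults applied to every tenant (hypothetical settings).
DEFAULTS = {
    "theme": "default",
    "max_upload_mb": 2048,
    "search_facets": ["subject", "year"],
}

# Per-tenant overrides (hypothetical tenant IDs).
TENANT_OVERRIDES = {
    "archive-a": {"theme": "blue", "max_upload_mb": 512},
    "archive-b": {"search_facets": ["subject", "geography"]},
}


def tenant_config(tenant_id):
    """Return a read-through view: tenant overrides first, then defaults."""
    return ChainMap(TENANT_OVERRIDES.get(tenant_id, {}), DEFAULTS)


config_a = tenant_config("archive-a")
config_unknown = tenant_config("no-such-tenant")
```

A lookup such as `config_a["theme"]` returns the tenant's own value when one exists and falls back to the shared default otherwise.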
  • 10. Single Sign On & ID Management: Central authentication and identification system. Supports ORCID and social IDs (Google, Facebook and LinkedIn). Authorization management will support role-based access controls.
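Role-based access control reduces to a mapping from roles to permissions plus a check against the roles a user holds. A minimal sketch follows; the role names and permission strings are assumptions made for illustration, not the actual Archonnex authorization model.

```python
# Hypothetical roles and permissions, for illustration only.
ROLE_PERMISSIONS = {
    "depositor": {"deposit:create", "deposit:read"},
    "curator": {"deposit:read", "deposit:publish"},
    "admin": {"deposit:create", "deposit:read", "deposit:publish", "tenant:configure"},
}


def is_allowed(user_roles, permission):
    """Return True if any of the user's roles grants the permission."""
    return any(
        permission in ROLE_PERMISSIONS.get(role, set())
        for role in user_roles
    )


can_publish = is_allowed(["depositor", "curator"], "deposit:publish")
can_configure = is_allowed(["depositor"], "tenant:configure")
```

Unknown roles grant nothing, so a user with an unrecognized role is denied by default.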
  • 11. Deposit Manager: Supports Submission Information Package (SIP) data ingest and storage, and coordinates virus scanning, statistical file validation, variable extraction, image processing and metadata extraction. Easy-to-use workspace and file management. Ability to publish at a granular level. Embeddable UI plugin supporting tenant-specific configuration. Supported ingest protocols: HTTP, SFTP and email. Integrates with the BPM/workflow engine.
  • 12. Preservation Manager: Implements transcoding of data specific to MIME types. Generates the Archival Information Package (AIP) and Dissemination Information Package (DIP). Replicates the AIP to storage systems for long-term preservation. Performs fixity checks periodically.
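A fixity check compares a file's current checksum with the one recorded when it was archived. The sketch below assumes SHA-256 and a simple path-to-checksum manifest; both are illustrative choices, since the slides do not say which algorithm or manifest format the Preservation Manager uses.

```python
import hashlib
import os
import tempfile


def sha256_of(path, chunk_size=1 << 20):
    """Stream the file in chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_fixity(manifest):
    """Return the paths whose current checksum differs from the recorded one."""
    return [path for path, expected in manifest.items() if sha256_of(path) != expected]


# Demonstration on a temporary file: record a checksum, then alter the file.
with tempfile.TemporaryDirectory() as workdir:
    sample = os.path.join(workdir, "study.csv")
    with open(sample, "wb") as handle:
        handle.write(b"id,value\n1,42\n")
    manifest = {sample: sha256_of(sample)}

    unchanged_failures = check_fixity(manifest)   # nothing has changed yet

    with open(sample, "ab") as handle:
        handle.write(b"tampered")
    detected_change = check_fixity(manifest) == [sample]
```

Running such a check on a schedule against every replica is what makes silent corruption visible.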
  • 13. Search Manager: Full-featured text search using Apache Solr. Embeddable UI plugin supporting tenant-specific configuration. Coverage includes, but is not limited to, keywords, metadata, text and geospatial fields. Exploring GeoBlacklight for search and dissemination of geographical data.
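A keyword search restricted to one tenant, with an optional geospatial filter, could be expressed as a Solr select request along these lines. The sketch only builds the URL; the core name `studies` and the field names `tenant` and `location` are hypothetical, while `q`, `fq`, `wt` and the `geofilt` query parser are standard Solr features.

```python
from urllib.parse import urlencode


def build_solr_query(base_url, keywords, tenant, point=None, radius_km=None):
    """Build a Solr /select URL with a tenant filter and optional radius filter."""
    params = [
        ("q", keywords),
        ("fq", f"tenant:{tenant}"),   # hypothetical tenant field
        ("wt", "json"),
    ]
    if point is not None and radius_km is not None:
        lat, lon = point
        # geofilt keeps documents within radius_km of the point;
        # "location" is a hypothetical spatial field name.
        params.append(
            ("fq", f"{{!geofilt sfield=location pt={lat},{lon} d={radius_km}}}")
        )
    return f"{base_url}/select?{urlencode(params)}"


url = build_solr_query(
    "http://localhost:8983/solr/studies",   # hypothetical core
    "voter turnout",
    "archive-a",
    point=(42.28, -83.74),
    radius_km=50,
)
```

Filter queries (`fq`) are a natural fit for the tenant restriction because they narrow the result set without affecting relevance scoring.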
  • 14. Anti-Virus Scanner: Anti-virus scanning as a service. Supports ClamAV and Sophos. Can expand to multiple nodes, allowing horizontal scalability to support scanning large data sets.
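One way to spread a large deposit across several scanner nodes is to assign the largest files first, each to the node with the least work so far. This is a generic load-balancing sketch and an assumption about how such scaling might be done; the slides do not describe the actual scheduling, and the file names and sizes are made up.

```python
import heapq


def assign_to_nodes(file_sizes, node_count):
    """Greedy balancing: largest files first, each to the least-loaded node."""
    # Heap entries are (total_size, node_id, files); node_id breaks ties.
    heap = [(0, node, []) for node in range(node_count)]
    heapq.heapify(heap)
    by_size = sorted(file_sizes.items(), key=lambda item: item[1], reverse=True)
    for name, size in by_size:
        load, node, files = heapq.heappop(heap)
        files.append(name)
        heapq.heappush(heap, (load + size, node, files))
    return {node: (load, files) for load, node, files in heap}


# Hypothetical deposit: file name -> size in MB.
plan = assign_to_nodes(
    {"a.sav": 900, "b.csv": 500, "c.zip": 400, "d.txt": 100},
    node_count=2,
)
```

Adding nodes then shortens the total scan time roughly in proportion, as long as no single file dominates the deposit.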
  • 15. SPSS Processor: Performs additional processing of IBM SPSS files. Analyzes and reports potential missing variables and inconsistencies. Extracts variables and stores them for online analysis tools.
  • 16. Open API: RESTful services to enable metadata harvesting and exports using industry-wide standards and formats, for example RDF, JSON-LD, DDI XML…
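As an illustration of a JSON-LD export, a study record might be serialized as shown below. The use of the schema.org `Dataset` type, the identifier and the field selection are assumptions for this sketch; the slides do not specify the vocabulary the Open API returns.

```python
import json


def dataset_to_jsonld(identifier, title, keywords):
    """Describe a dataset as JSON-LD using the schema.org vocabulary."""
    return {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "@id": identifier,
        "name": title,
        "keywords": keywords,
    }


# Hypothetical record, for illustration only.
record = dataset_to_jsonld(
    "https://example.org/studies/12345",
    "Example Survey of Households",
    ["survey", "households"],
)
serialized = json.dumps(record, indent=2)
```

Because JSON-LD is plain JSON with a few reserved keys, the same payload serves ordinary REST clients and linked-data harvesters.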
  • 17. Workflow Engine: Central workflow management providing a unified action list for users. Ability to model business process flows and integrate them with technical components. Activiti is the chosen technology.
  • 18. Reports & Analytics: Captures all system and user activities within the components, enabling effective provenance data collection. Central consolidated storage for all logs. Ability to discover, visualize and report on the data collected. Technologies: Elasticsearch & Kibana; Google Analytics (client and server side).
  • 19. Content Specific Processors: Add-on modules that can derive and extract custom attributes. These modules can be invoked using messages and added to the processing pipelines. For example, the Image Processor module produces thumbnails from image files for easier display in the GUI. Geospatial data is published to GeoServer.
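The add-on idea can be sketched as a registry of processors, each declaring the MIME types it handles and returning the attributes it derives. Everything here is hypothetical, including the decorator, the processor names and the message fields; the thumbnail step is a placeholder that only computes a file name.

```python
# Registry of (mime_prefix, function) pairs, filled by the decorator below.
PROCESSORS = []


def processor(mime_prefix):
    """Register a function as a processor for MIME types with this prefix."""
    def register(func):
        PROCESSORS.append((mime_prefix, func))
        return func
    return register


@processor("image/")
def thumbnail_attributes(message):
    # Placeholder: a real module would render a thumbnail image here.
    return {"thumbnail": message["path"] + ".thumb.png"}


@processor("")  # empty prefix matches every MIME type
def basic_attributes(message):
    return {"extension": message["path"].rsplit(".", 1)[-1]}


def run_pipeline(message):
    """Apply every matching processor and collect the derived attributes."""
    derived = {}
    for mime_prefix, func in PROCESSORS:
        if message["mime_type"].startswith(mime_prefix):
            derived.update(func(message))
    return {**message, "derived": derived}


image_result = run_pipeline({"path": "map.png", "mime_type": "image/png"})
text_result = run_pipeline({"path": "notes.txt", "mime_type": "text/plain"})
```

New content types are supported by registering another processor, without touching the pipeline itself.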
  • 20. Geo Tagger: Add-on module that can derive and process geographical information from inputs such as a street address, an IP address, shapes on a map or markers on a map. It will generate geo tag information for display and to support search capabilities.
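For the simplest input, a marker on a map, the resulting geo tag could be represented as a GeoJSON point feature. GeoJSON is an assumption here, since the slides do not name the format the Geo Tagger emits, and the `label` property is hypothetical. Note that GeoJSON orders coordinates as longitude, then latitude.

```python
def marker_to_geotag(latitude, longitude, label):
    """Convert a map marker into a GeoJSON Point feature."""
    if not (-90 <= latitude <= 90 and -180 <= longitude <= 180):
        raise ValueError("coordinates out of range")
    return {
        "type": "Feature",
        # GeoJSON positions are [longitude, latitude].
        "geometry": {"type": "Point", "coordinates": [longitude, latitude]},
        "properties": {"label": label},
    }


# Hypothetical marker, for illustration only.
geotag = marker_to_geotag(42.28, -83.74, "Survey site")
```

Address and IP inputs would first need a geocoding step to obtain coordinates, which is outside the scope of this sketch.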
  • 21. Research Data Integration: Ability to integrate with external data producers, for example SEAD, OSF…
  • 22.
      • Web UI: HTML, JavaScript, CSS, Twitter Bootstrap, jQuery and plugins, Facebook ReactJS, Advanced REST Client
          • Protocol: HTTPS, HTTPS/REST/JSON/JSON-LD, HTTPS/SOAP/XML *
          • Middleware: J2EE application servers (Jetty, Apache Tomcat), Spring MVC, Groovy/Grails, Ruby/Rails
      • Desktop UI: Java Swing, Java Web Start
          • Protocol: HTTPS, HTTPS/REST/JSON, Java Network Launch Protocol (JNLP)
          • Middleware: J2EE application servers (Jetty, Apache Tomcat), Spring MVC, Spring Remoting & Web Services
      • Batch Automation UI: UC4 *, Control-M *
          • Protocol: SSH, Java RMI
          • Scripting/Orchestration: Shell programming, Perl, Ruby, Groovy
      • Storage/Databases: Network File Storage, Oracle/MySQL/PostgreSQL, Amazon Cloud, Duraspace Cloud
      • ESB/Message Brokers: Apache ActiveMQ, RabbitMQ, Apache Camel
      • Source Code Management: Git, CVS *
      • Productivity Tools: Drupal/Confluence/Google Sites, JIRA, Bamboo, Fisheye, Crucible, Stash, Microsoft Office
      • Operating Systems: Servers (Linux), Desktop (Linux/Windows/Mac)
      • Build Tools: Ant, Maven
  • 23.
  • 24. openICPSR is scheduled to be released on the new Archonnex platform by the end of July 2016.
  • 26. Thank you. Thomas Murphy, CNS Director, tomurphy@umich.edu; Harsha Ummerpillai, Software Architect, harshau@umich.edu