SlideShare a Scribd company logo
1 of 43
Download to read offline
Apache ManifoldCF
About me
● Open Source ECM Specialist at Sourcesence

● Author and Technical Reviewer at Packt Publishing
   ○ Alfresco 3 Web Services (2010)
   ○ GateIn Cookbook (2012)

● Alfresco Community (nickname OpenPj)
   ○ Alfresco Wiki Gardener
   ○ Top 10 supporter (english and italian)
   ○ Moderator of the italian forum

● PMC Member at the Apache Software Foundation

● JBoss Community
   ○ Content editor for jboss.org
   ○ Project Leader and Committer for PortletSwap
Overview
● The story
● What is ManifoldCF?
   ○ What is a repository?
   ○ What is a search server?
● Why ManifoldCF?
● Architecture
● The growing path
   ○ The 0.3-incubating version
   ○ The 0.4-incubating version
   ○ The 0.5-incubating version
   ○ The 0.6 version (graduated ^__^)
   ○ What's new in the 1.0.1 version
● The book: ManifoldCF in Action
● Demo
● Resources
The story

The original ManifoldCF code base was granted by MetaCarta
Inc., to the Apache Software Foundation in December 2009.

The MetaCarta effort represented more than five years of
successful development and testing in multiple, challenging
enterprise environments.

The project was graduated as Apache Top Level Project in July
2012.

                               ^__^
What is ManifoldCF?
● Open Source crawler
  ○ schedule jobs to create indexes
     ■ get contents from repositories
     ■ push contents on search servers
                                         Search Server
  Repository 1
                                               1
                                         Search Server
  Repository 2     Apache ManifoldCF           2
                                         Search Server
  Repository 3
                                               3
What is ManifoldCF?
● Open Source crawler
  ○ schedule jobs to create indexes
     ■ get contents from repositories
     ■ push contents on search servers

● Out-Of-The-Box it is distributed as J2EE web apps
  ○ REST API
  ○ Authority Service
  ○ Crawler UI

● Can be embedded in any Java application
What is a repository?
● Open Source crawler
  ○ schedule jobs to create indexes
     ■ get contents from repositories
     ■ push contents on search servers
                                         Search Server
  Repository 1
                                               1
                                         Search Server
  Repository 2     Apache ManifoldCF           2
                                         Search Server
  Repository 3
                                               3
What is a repository?
● central place where to put and get contents
● contents are kept is an organized way
    ○ ER model is the old way
    ○ Node graph
        ■ properties (metadata)
        ■ associations
        ■ renditions
● base component of Enterprise Content Management (ECM)
  systems
● is from Latin repositorium
    ○ table of service
    ○ vessel
    ○ chamber
    ○ where to keep and find your things!!!
Enterprise Content Management
Enterprise content management (ECM) is a formalized means of
organizing and storing an organization's documents, and other
content, that relate to the organization's processes. The term
encompasses strategies, methods, and tools used throughout the
lifecycle of the content.

                                                                    Wikipedia
                       http://en.wikipedia.org/wiki/Enterprise_content_management
Enterprise Content Management
      ECM



                         Enterprise services




                   Workflows and processes




                                  Users +
            Repository            groups       BPM
                                (LDAP, IDM)
What is a repository? - You use it!!!
● Some simple examples:
   ○ SMTP servers
   ○ Google Drive
   ○ Dropbox

● Some Open Source repository implementations:
   ○ exoJCR
   ○ Apache JackRabbit

● Some Open Source ECM systems for critical usage:
   ○ Alfresco
   ○ Nuxeo
   ○ Hippo
What is a repository? - Decoration
                    CMIS
                     JCR
                    REST
                    SOAP
                    IMAP
         apply      EMAIL
        metadata     FTP         retrieve content using
                                        metadata

                                   Query Languages:
                                          CMIS
                                        JCR SQL
                   Repository             XPath
                                         Lucene
                                 Full Text (Google style)




                                Indexes
What is a repository? - Architecture

     APIs (CMIS, REST, FTP, WebDAV, IMAP)

                       Model


                   Storage


       Content Store           Indexes
What is a repository? - Repo Model
 ● different point of view of how managing data
    ○ no more Relational databases (ER)
 ● repositories offers you an API!
 ● based on the JCR Repository Model (JSR-283)
    ○ workspaces
    ○ identifiers
    ○ users
    ○ nodes and node types (contents)
       ■ properties and property types
       ■ associations (shared nodes)
What is a repository? - Repo Model
 ● A node is a generic content stored in a repository
    ○ type
    ○ properties
    ○ associations
    ○ binary streams (optional)
       ■ renditions
          ■ text document
          ■ Video
          ■ Image
          ■. . .
What is a repository? - Repo Model
                    Properties
      Type          (metadata):

                    - name
             Node
                    - description
                    - mimetype
                    - tags
                    - categories

                                    Renditions




       Binary 1        Binary 2              Binary 3
What is a repository? - Repo Model
     Repository

              Workspace
                  1
            Workspace
                2

      Workspace             Root node
          3


              A                B
                                            C
        D               E               G
Why use a repository?
 ● adding new node types means to add a configuration
 ● you can scale out easily
 ● storing very large amounts of data
 ● storing simple data structures, such as simple JSON
   documents
 ● looking up data by keys rather than using queries
 ● searching for data based upon relevance rather than criteria
 ● evolving schemas and/or data structures
 ● caching data in-memory for performance
 ● giving up consistency guarantees for increased availability
Why use a repository?
 ● Standard API
    ○ Content Management Interoperability Services (CMIS)
    ○ Java Content Repository (JCR)
 ● Hierarchical structure
 ● Transaction support
 ● Versioning
 ● Locking
 ● Observation
 ● References
 ● Navigation services
    ○ parents
    ○ children
    ○ associated
 ● Search services
What is a search server?
● Open Source crawler
  ○ schedule jobs to create indexes
     ■ get contents from repositories
     ■ push contents on search servers
                                         Search Server
  Repository 1
                                               1
                                         Search Server
  Repository 2     Apache ManifoldCF           2
                                         Search Server
  Repository 3
                                               3
What is a search server?
A search server is an application that allows users to
find repository contents quickly using:

●   keywords (full text search)
●   content fields
●   tags
●   categories
●   ranking

The informations kept are indexes.
What is a search server?


             REST API


            Storage

             Indexes
Why ManifoldCF?
● Reliability
● Incremental
● Multi repositories
● Security model
● Monitoring
Why ManifoldCF? - Reliability
Jobs scheduling and configuration are stored in the
database to maintain the state of all the executions

     Repository       Pull Agent Daemon             Search Server
                     configuration and scheduling




                            Database
Why ManifoldCF? - Incremental

Jobs can be optionally configured to re-visit contents
incrementally
                   Repository




         N1
                                  Apache ManifoldCF




              N2
                           N4
Why ManifoldCF? - Multi repositories
Jobs can retrieve contents from the following repositories:
 ● CMIS-compliant
 ● Alfresco
 ● IBM FileNet
 ● EMC Documentum
 ● Microsoft SharePoint
 ● OpenText LiveLink
 ● Autonomy Meridio
 ● Memex Patriarch
 ● Windows Share/DFS
 ● Generic JDBC
 ● Generic Filesystem
 ● Generic RSS and Web
Why ManifoldCF? - Multi repositories
Jobs can ingest contents to the following search
servers:
● Apache Solr
● ElasticSearch
● OpenSearchServer
● MetaCarta GTS
Why ManifoldCF? - Security model
Retrieve per-content ACLs                      Authority 1

                        Authority Service      Authority 2

                                               Authority 3


       Repository 1

       Repository 2    Pull Agent Daemon
                                            user access
       Repository 3                           tokens
                      doc access
                        tokens
                                                  user specific
                         Search Server              search
                                                    results
Why ManifoldCF? - Monitoring
UI Crawler allows you to:
 ● configure jobs and connectors
 ● monitor jobs execution
 ● monitor contents ingestion
   ○ status reports
      ■ document status
      ■ queue status
   ○ history reports
      ■ simple history
      ■ maximum activity
      ■ maximum bandwidth
      ■ result histogram
Architecture

● Pull Agent Daemon
   ○ Jobs
      ■ Repository Connectors
      ■ Output Connectors
      ■ Authority Connectors
Architecture

● Pull Agent Daemon (the core service)
   ○ Jobs (execute the ingestion tasks)
      ■ Repository Connectors (retrieve contents)
      ■ Output Connectors (ingest contents)
      ■ Authority Connectors (retrieve ACLs)
Architecture
                  Authority Service



                                      Search Server
   Repository 1
                                            1
                                      Search Server
   Repository 2   Pull Agent Daemon
                                            2
                                      Search Server
   Repository 3
                                            3




                      Database
Architecture - Job

A job is an ingestion work that consists of:
    ○ verbal description
    ○ repository connection
       ■ authority connection (optional)
    ○ metadata mapping
    ○ output connection (search server)
    ○ crawling model
    ○ scheduling information (on demand or time ranges)
Architecture - Job
                                                         Authority
                                                         Connector
                                        ACLs
        Repository
        Connector
                            retrieve                        Output
                          content ACL                      Connector



      Repository                     Job                  Search Server

- query to retrieve contents                           - metadata mapping
                                - verbal description   - content ingestion
                                - crawling model
                                - scheduling
The 0.3-incubating version

● CMIS Repository Connector
● OpenSearchServer Output Connector
● Scripting Language
● New Maven build process
● Several bug fixes
The 0.4-incubating version

● Alfresco Connector
● JDBC Connector now supports MySQL
● CMIS Connector upgraded to OpenCMIS 0.5.0
● Several bug fixes
The 0.5-incubating version

● Apache Velocity for connectors UI templates
● ElasticSearch Output Connector
● CMIS Connector upgraded to OpenCMIS 0.6.0
● Prebuild connector support: just add jars and go!
● New Japanese localization
● Several bug fixes
The 0.6 version
●   Project moved to JDK 1.6
●   Many improvements for all the connectors
●   Updated the Apache Solr plugin
●   Several bugfixes
What's new in 1.0.1 version
●   Microsoft SharePoint 2010 support
●   JDBC Connector now manages metadata
●   CMIS Connector upgraded to OpenCMIS 0.7.0
●   Several bugfixes
The book: ManifoldCF in Action

ManifoldCF in Action
by Karl Wright
published by Manning


Karl is the original developer and the
principal committer of Apache ManifoldCF


The book is available at the following site:
http://www.manning.com/wright
DEMO
Resources

Homepage:
http://manifoldcf.apache.org



Download page:
http://manifoldcf.apache.org/en_US/download.html
Thank you for your
       attention!
          ^__^

http://www.open4dev.com

More Related Content

What's hot

Oslo Vancouver Project Update
Oslo Vancouver Project UpdateOslo Vancouver Project Update
Oslo Vancouver Project UpdateBen Nemec
 
Last Month in PHP - December 2015
Last Month in PHP - December 2015Last Month in PHP - December 2015
Last Month in PHP - December 2015Eric Poe
 
Introduction to Ruby Native Extensions and Foreign Function Interface
Introduction to Ruby Native Extensions and Foreign Function InterfaceIntroduction to Ruby Native Extensions and Foreign Function Interface
Introduction to Ruby Native Extensions and Foreign Function InterfaceOleksii Sukhovii
 
Efficient HTTP applications on the JVM with Ratpack - Voxxed Days Berlin 2016
Efficient HTTP applications on the JVM with Ratpack - Voxxed Days Berlin 2016Efficient HTTP applications on the JVM with Ratpack - Voxxed Days Berlin 2016
Efficient HTTP applications on the JVM with Ratpack - Voxxed Days Berlin 2016Alvaro Sanchez-Mariscal
 
Gr8Conf 2016 - GORM Inside and Out
Gr8Conf 2016 - GORM Inside and OutGr8Conf 2016 - GORM Inside and Out
Gr8Conf 2016 - GORM Inside and Outgraemerocher
 
LNUG - A year with AWS
LNUG - A year with AWSLNUG - A year with AWS
LNUG - A year with AWSAndrew Clarke
 
TiDB Introduction - Boston MySQL Meetup Group
TiDB Introduction - Boston MySQL Meetup GroupTiDB Introduction - Boston MySQL Meetup Group
TiDB Introduction - Boston MySQL Meetup GroupMorgan Tocker
 
Mime Magic With Apache Tika
Mime Magic With Apache TikaMime Magic With Apache Tika
Mime Magic With Apache TikaJukka Zitting
 
Introducing TiDB - Percona Live Frankfurt
Introducing TiDB - Percona Live FrankfurtIntroducing TiDB - Percona Live Frankfurt
Introducing TiDB - Percona Live FrankfurtMorgan Tocker
 
Using ELK-Stack (Elasticsearch, Logstash and Kibana) with BizTalk Server
Using ELK-Stack (Elasticsearch, Logstash and Kibana) with BizTalk ServerUsing ELK-Stack (Elasticsearch, Logstash and Kibana) with BizTalk Server
Using ELK-Stack (Elasticsearch, Logstash and Kibana) with BizTalk ServerBizTalk360
 
From monolith web app to micro-frontends
From monolith web app to micro-frontendsFrom monolith web app to micro-frontends
From monolith web app to micro-frontendsRustam Aliyev
 
Gr8Conf 2016 - What's new in Grails 3
Gr8Conf 2016 - What's new in Grails 3Gr8Conf 2016 - What's new in Grails 3
Gr8Conf 2016 - What's new in Grails 3graemerocher
 
Manuel Hurtado. Couchbase paradigma4oct
Manuel Hurtado. Couchbase paradigma4octManuel Hurtado. Couchbase paradigma4oct
Manuel Hurtado. Couchbase paradigma4octParadigma Digital
 
ストリーミングデータのアドホック分析エンジンの比較
ストリーミングデータのアドホック分析エンジンの比較ストリーミングデータのアドホック分析エンジンの比較
ストリーミングデータのアドホック分析エンジンの比較Yoshiyasu SAEKI
 
React Component in scala.js
React Component in scala.jsReact Component in scala.js
React Component in scala.jsUnfold UI
 
Tear It Down, Build It Back Up: Empowering Developers with Amazon CloudFormation
Tear It Down, Build It Back Up: Empowering Developers with Amazon CloudFormationTear It Down, Build It Back Up: Empowering Developers with Amazon CloudFormation
Tear It Down, Build It Back Up: Empowering Developers with Amazon CloudFormationJames Andrew Vaughn
 
Presto Strata Hadoop SJ 2016 short talk
Presto Strata Hadoop SJ 2016 short talkPresto Strata Hadoop SJ 2016 short talk
Presto Strata Hadoop SJ 2016 short talkkbajda
 

What's hot (20)

Oslo Vancouver Project Update
Oslo Vancouver Project UpdateOslo Vancouver Project Update
Oslo Vancouver Project Update
 
Last Month in PHP - December 2015
Last Month in PHP - December 2015Last Month in PHP - December 2015
Last Month in PHP - December 2015
 
Introduction to Ruby Native Extensions and Foreign Function Interface
Introduction to Ruby Native Extensions and Foreign Function InterfaceIntroduction to Ruby Native Extensions and Foreign Function Interface
Introduction to Ruby Native Extensions and Foreign Function Interface
 
Efficient HTTP applications on the JVM with Ratpack - Voxxed Days Berlin 2016
Efficient HTTP applications on the JVM with Ratpack - Voxxed Days Berlin 2016Efficient HTTP applications on the JVM with Ratpack - Voxxed Days Berlin 2016
Efficient HTTP applications on the JVM with Ratpack - Voxxed Days Berlin 2016
 
Gr8Conf 2016 - GORM Inside and Out
Gr8Conf 2016 - GORM Inside and OutGr8Conf 2016 - GORM Inside and Out
Gr8Conf 2016 - GORM Inside and Out
 
LNUG - A year with AWS
LNUG - A year with AWSLNUG - A year with AWS
LNUG - A year with AWS
 
TiDB Introduction - Boston MySQL Meetup Group
TiDB Introduction - Boston MySQL Meetup GroupTiDB Introduction - Boston MySQL Meetup Group
TiDB Introduction - Boston MySQL Meetup Group
 
Mime Magic With Apache Tika
Mime Magic With Apache TikaMime Magic With Apache Tika
Mime Magic With Apache Tika
 
Introducing ELK
Introducing ELKIntroducing ELK
Introducing ELK
 
Flywaydb
FlywaydbFlywaydb
Flywaydb
 
Introducing TiDB - Percona Live Frankfurt
Introducing TiDB - Percona Live FrankfurtIntroducing TiDB - Percona Live Frankfurt
Introducing TiDB - Percona Live Frankfurt
 
Using ELK-Stack (Elasticsearch, Logstash and Kibana) with BizTalk Server
Using ELK-Stack (Elasticsearch, Logstash and Kibana) with BizTalk ServerUsing ELK-Stack (Elasticsearch, Logstash and Kibana) with BizTalk Server
Using ELK-Stack (Elasticsearch, Logstash and Kibana) with BizTalk Server
 
From monolith web app to micro-frontends
From monolith web app to micro-frontendsFrom monolith web app to micro-frontends
From monolith web app to micro-frontends
 
Gr8Conf 2016 - What's new in Grails 3
Gr8Conf 2016 - What's new in Grails 3Gr8Conf 2016 - What's new in Grails 3
Gr8Conf 2016 - What's new in Grails 3
 
Manuel Hurtado. Couchbase paradigma4oct
Manuel Hurtado. Couchbase paradigma4octManuel Hurtado. Couchbase paradigma4oct
Manuel Hurtado. Couchbase paradigma4oct
 
ストリーミングデータのアドホック分析エンジンの比較
ストリーミングデータのアドホック分析エンジンの比較ストリーミングデータのアドホック分析エンジンの比較
ストリーミングデータのアドホック分析エンジンの比較
 
What's New In Rails 4.2
What's New In Rails 4.2What's New In Rails 4.2
What's New In Rails 4.2
 
React Component in scala.js
React Component in scala.jsReact Component in scala.js
React Component in scala.js
 
Tear It Down, Build It Back Up: Empowering Developers with Amazon CloudFormation
Tear It Down, Build It Back Up: Empowering Developers with Amazon CloudFormationTear It Down, Build It Back Up: Empowering Developers with Amazon CloudFormation
Tear It Down, Build It Back Up: Empowering Developers with Amazon CloudFormation
 
Presto Strata Hadoop SJ 2016 short talk
Presto Strata Hadoop SJ 2016 short talkPresto Strata Hadoop SJ 2016 short talk
Presto Strata Hadoop SJ 2016 short talk
 

Similar to Apache ManifoldCF @ Linux Day 2012

The ECM world from the point of view of Alfresco - Linux Day 2013 - Rome
The ECM world from the point of view of Alfresco - Linux Day 2013 - RomeThe ECM world from the point of view of Alfresco - Linux Day 2013 - Rome
The ECM world from the point of view of Alfresco - Linux Day 2013 - RomePiergiorgio Lucidi
 
Apache2 BootCamp : Understanding Apache Internals
Apache2 BootCamp : Understanding Apache InternalsApache2 BootCamp : Understanding Apache Internals
Apache2 BootCamp : Understanding Apache InternalsWildan Maulana
 
2016_04_04_CNI_Spring_Meeting_Microservices
2016_04_04_CNI_Spring_Meeting_Microservices2016_04_04_CNI_Spring_Meeting_Microservices
2016_04_04_CNI_Spring_Meeting_MicroservicesJason Varghese
 
Decoupling Content Management with Create.js and PHPCR
Decoupling Content Management with Create.js and PHPCRDecoupling Content Management with Create.js and PHPCR
Decoupling Content Management with Create.js and PHPCRHenri Bergius
 
#RADC4L16: An API-First Archives Approach at NPR
#RADC4L16: An API-First Archives Approach at NPR#RADC4L16: An API-First Archives Approach at NPR
#RADC4L16: An API-First Archives Approach at NPRCamille Salas
 
BlackRay - The open Source Data Engine
BlackRay - The open Source Data EngineBlackRay - The open Source Data Engine
BlackRay - The open Source Data Enginefschupp
 
Introduction to rook
Introduction to rookIntroduction to rook
Introduction to rookRohan Gupta
 
Restful风格ž„web服务架构
Restful风格ž„web服务架构Restful风格ž„web服务架构
Restful风格ž„web服务架构Benjamin Tan
 
Conceptos básicos. Seminario web 6: Despliegue de producción
Conceptos básicos. Seminario web 6: Despliegue de producciónConceptos básicos. Seminario web 6: Despliegue de producción
Conceptos básicos. Seminario web 6: Despliegue de producciónMongoDB
 
Blackray @ SAPO CodeBits 2009
Blackray @ SAPO CodeBits 2009Blackray @ SAPO CodeBits 2009
Blackray @ SAPO CodeBits 2009fschupp
 
Mastering MongoDB on Kubernetes, the power of operators
Mastering MongoDB on Kubernetes, the power of operators Mastering MongoDB on Kubernetes, the power of operators
Mastering MongoDB on Kubernetes, the power of operators DoKC
 
Eclipse Enterprise Content Repository (ECR)
Eclipse Enterprise Content Repository (ECR)Eclipse Enterprise Content Repository (ECR)
Eclipse Enterprise Content Repository (ECR)Florent Guillaume
 
Experiences with Evangelizing Java Within the Database
Experiences with Evangelizing Java Within the DatabaseExperiences with Evangelizing Java Within the Database
Experiences with Evangelizing Java Within the DatabaseMarcelo Ochoa
 
Implementing a JSR-283 Content Repository in PHP
Implementing a JSR-283 Content Repository in PHPImplementing a JSR-283 Content Repository in PHP
Implementing a JSR-283 Content Repository in PHPKarsten Dambekalns
 
Strimzi - Where Apache Kafka meets OpenShift - OpenShift Spain MeetUp
Strimzi - Where Apache Kafka meets OpenShift - OpenShift Spain MeetUpStrimzi - Where Apache Kafka meets OpenShift - OpenShift Spain MeetUp
Strimzi - Where Apache Kafka meets OpenShift - OpenShift Spain MeetUpJosé Román Martín Gil
 
AngularJS 1.x - your first application (problems and solutions)
AngularJS 1.x - your first application (problems and solutions)AngularJS 1.x - your first application (problems and solutions)
AngularJS 1.x - your first application (problems and solutions)Igor Talevski
 
Running MongoDB 3.0 on AWS
Running MongoDB 3.0 on AWSRunning MongoDB 3.0 on AWS
Running MongoDB 3.0 on AWSMongoDB
 
Melbourne User Group OAK and MongoDB
Melbourne User Group OAK and MongoDBMelbourne User Group OAK and MongoDB
Melbourne User Group OAK and MongoDBYuval Ararat
 
Application of Library Management Software: NewGenLib
Application of Library Management Software: NewGenLibApplication of Library Management Software: NewGenLib
Application of Library Management Software: NewGenLibDavid Nzoputa Ofili
 
JCDL 2016 Doctoral Consortium - Web Archive Profiling
JCDL 2016 Doctoral Consortium - Web Archive ProfilingJCDL 2016 Doctoral Consortium - Web Archive Profiling
JCDL 2016 Doctoral Consortium - Web Archive ProfilingSawood Alam
 

Similar to Apache ManifoldCF @ Linux Day 2012 (20)

The ECM world from the point of view of Alfresco - Linux Day 2013 - Rome
The ECM world from the point of view of Alfresco - Linux Day 2013 - RomeThe ECM world from the point of view of Alfresco - Linux Day 2013 - Rome
The ECM world from the point of view of Alfresco - Linux Day 2013 - Rome
 
Apache2 BootCamp : Understanding Apache Internals
Apache2 BootCamp : Understanding Apache InternalsApache2 BootCamp : Understanding Apache Internals
Apache2 BootCamp : Understanding Apache Internals
 
2016_04_04_CNI_Spring_Meeting_Microservices
2016_04_04_CNI_Spring_Meeting_Microservices2016_04_04_CNI_Spring_Meeting_Microservices
2016_04_04_CNI_Spring_Meeting_Microservices
 
Decoupling Content Management with Create.js and PHPCR
Decoupling Content Management with Create.js and PHPCRDecoupling Content Management with Create.js and PHPCR
Decoupling Content Management with Create.js and PHPCR
 
#RADC4L16: An API-First Archives Approach at NPR
#RADC4L16: An API-First Archives Approach at NPR#RADC4L16: An API-First Archives Approach at NPR
#RADC4L16: An API-First Archives Approach at NPR
 
BlackRay - The open Source Data Engine
BlackRay - The open Source Data EngineBlackRay - The open Source Data Engine
BlackRay - The open Source Data Engine
 
Introduction to rook
Introduction to rookIntroduction to rook
Introduction to rook
 
Restful风格ž„web服务架构
Restful风格ž„web服务架构Restful风格ž„web服务架构
Restful风格ž„web服务架构
 
Conceptos básicos. Seminario web 6: Despliegue de producción
Conceptos básicos. Seminario web 6: Despliegue de producciónConceptos básicos. Seminario web 6: Despliegue de producción
Conceptos básicos. Seminario web 6: Despliegue de producción
 
Blackray @ SAPO CodeBits 2009
Blackray @ SAPO CodeBits 2009Blackray @ SAPO CodeBits 2009
Blackray @ SAPO CodeBits 2009
 
Mastering MongoDB on Kubernetes, the power of operators
Mastering MongoDB on Kubernetes, the power of operators Mastering MongoDB on Kubernetes, the power of operators
Mastering MongoDB on Kubernetes, the power of operators
 
Eclipse Enterprise Content Repository (ECR)
Eclipse Enterprise Content Repository (ECR)Eclipse Enterprise Content Repository (ECR)
Eclipse Enterprise Content Repository (ECR)
 
Experiences with Evangelizing Java Within the Database
Experiences with Evangelizing Java Within the DatabaseExperiences with Evangelizing Java Within the Database
Experiences with Evangelizing Java Within the Database
 
Implementing a JSR-283 Content Repository in PHP
Implementing a JSR-283 Content Repository in PHPImplementing a JSR-283 Content Repository in PHP
Implementing a JSR-283 Content Repository in PHP
 
Strimzi - Where Apache Kafka meets OpenShift - OpenShift Spain MeetUp
Strimzi - Where Apache Kafka meets OpenShift - OpenShift Spain MeetUpStrimzi - Where Apache Kafka meets OpenShift - OpenShift Spain MeetUp
Strimzi - Where Apache Kafka meets OpenShift - OpenShift Spain MeetUp
 
AngularJS 1.x - your first application (problems and solutions)
AngularJS 1.x - your first application (problems and solutions)AngularJS 1.x - your first application (problems and solutions)
AngularJS 1.x - your first application (problems and solutions)
 
Running MongoDB 3.0 on AWS
Running MongoDB 3.0 on AWSRunning MongoDB 3.0 on AWS
Running MongoDB 3.0 on AWS
 
Melbourne User Group OAK and MongoDB
Melbourne User Group OAK and MongoDBMelbourne User Group OAK and MongoDB
Melbourne User Group OAK and MongoDB
 
Application of Library Management Software: NewGenLib
Application of Library Management Software: NewGenLibApplication of Library Management Software: NewGenLib
Application of Library Management Software: NewGenLib
 
JCDL 2016 Doctoral Consortium - Web Archive Profiling
JCDL 2016 Doctoral Consortium - Web Archive ProfilingJCDL 2016 Doctoral Consortium - Web Archive Profiling
JCDL 2016 Doctoral Consortium - Web Archive Profiling
 

More from Piergiorgio Lucidi

Embracing InnerSource for your adaptive Digital Transformation
Embracing InnerSource for your adaptive Digital TransformationEmbracing InnerSource for your adaptive Digital Transformation
Embracing InnerSource for your adaptive Digital TransformationPiergiorgio Lucidi
 
Introducing the ASF at Microsoft Build 2020 - Italian Dev Community
Introducing the ASF at Microsoft Build 2020 - Italian Dev Community Introducing the ASF at Microsoft Build 2020 - Italian Dev Community
Introducing the ASF at Microsoft Build 2020 - Italian Dev Community Piergiorgio Lucidi
 
Smart Alfresco ECM Program Strategy for Your New Success Story
Smart Alfresco ECM Program Strategy for Your New Success StorySmart Alfresco ECM Program Strategy for Your New Success Story
Smart Alfresco ECM Program Strategy for Your New Success StoryPiergiorgio Lucidi
 
Design your own BPM Program Strategy with Alfresco Process Services
Design your own BPM Program Strategy with Alfresco Process ServicesDesign your own BPM Program Strategy with Alfresco Process Services
Design your own BPM Program Strategy with Alfresco Process ServicesPiergiorgio Lucidi
 
Smart Content Migration using Apache ManifoldCF
Smart Content Migration using Apache ManifoldCFSmart Content Migration using Apache ManifoldCF
Smart Content Migration using Apache ManifoldCFPiergiorgio Lucidi
 
Alfresco Process Services Live Demo @ Red Hat Open Source Day 2017 Italy
Alfresco Process Services Live Demo @ Red Hat Open Source Day 2017 ItalyAlfresco Process Services Live Demo @ Red Hat Open Source Day 2017 Italy
Alfresco Process Services Live Demo @ Red Hat Open Source Day 2017 ItalyPiergiorgio Lucidi
 
The Journey of Apache ManifoldCF: Learning from ASF's Successes
The Journey of Apache ManifoldCF: Learning from ASF's SuccessesThe Journey of Apache ManifoldCF: Learning from ASF's Successes
The Journey of Apache ManifoldCF: Learning from ASF's SuccessesPiergiorgio Lucidi
 
Implementing portlets using Web Scripts
Implementing portlets using Web ScriptsImplementing portlets using Web Scripts
Implementing portlets using Web ScriptsPiergiorgio Lucidi
 
Alfresco Day Roma 2015 - Sourcesense
Alfresco Day Roma 2015 - SourcesenseAlfresco Day Roma 2015 - Sourcesense
Alfresco Day Roma 2015 - SourcesensePiergiorgio Lucidi
 
Alfresco Summit 2014 - Crafter CMS - Case European Bank
Alfresco Summit 2014 - Crafter CMS - Case European BankAlfresco Summit 2014 - Crafter CMS - Case European Bank
Alfresco Summit 2014 - Crafter CMS - Case European BankPiergiorgio Lucidi
 

More from Piergiorgio Lucidi (13)

Embracing InnerSource for your adaptive Digital Transformation
Embracing InnerSource for your adaptive Digital TransformationEmbracing InnerSource for your adaptive Digital Transformation
Embracing InnerSource for your adaptive Digital Transformation
 
Introducing the ASF at Microsoft Build 2020 - Italian Dev Community
Introducing the ASF at Microsoft Build 2020 - Italian Dev Community Introducing the ASF at Microsoft Build 2020 - Italian Dev Community
Introducing the ASF at Microsoft Build 2020 - Italian Dev Community
 
Smart Alfresco ECM Program Strategy for Your New Success Story
Smart Alfresco ECM Program Strategy for Your New Success StorySmart Alfresco ECM Program Strategy for Your New Success Story
Smart Alfresco ECM Program Strategy for Your New Success Story
 
Design your own BPM Program Strategy with Alfresco Process Services
Design your own BPM Program Strategy with Alfresco Process ServicesDesign your own BPM Program Strategy with Alfresco Process Services
Design your own BPM Program Strategy with Alfresco Process Services
 
Smart Content Migration using Apache ManifoldCF
Smart Content Migration using Apache ManifoldCFSmart Content Migration using Apache ManifoldCF
Smart Content Migration using Apache ManifoldCF
 
Alfresco Process Services Live Demo @ Red Hat Open Source Day 2017 Italy
Alfresco Process Services Live Demo @ Red Hat Open Source Day 2017 ItalyAlfresco Process Services Live Demo @ Red Hat Open Source Day 2017 Italy
Alfresco Process Services Live Demo @ Red Hat Open Source Day 2017 Italy
 
The Journey of Apache ManifoldCF: Learning from ASF's Successes
The Journey of Apache ManifoldCF: Learning from ASF's SuccessesThe Journey of Apache ManifoldCF: Learning from ASF's Successes
The Journey of Apache ManifoldCF: Learning from ASF's Successes
 
Implementing portlets using Web Scripts
Implementing portlets using Web ScriptsImplementing portlets using Web Scripts
Implementing portlets using Web Scripts
 
Alfresco Day Roma 2015 - Sourcesense
Alfresco Day Roma 2015 - SourcesenseAlfresco Day Roma 2015 - Sourcesense
Alfresco Day Roma 2015 - Sourcesense
 
Alfresco Summit 2014 - Crafter CMS - Case European Bank
Alfresco Summit 2014 - Crafter CMS - Case European BankAlfresco Summit 2014 - Crafter CMS - Case European Bank
Alfresco Summit 2014 - Crafter CMS - Case European Bank
 
Hippo CMS - A first look
Hippo CMS - A first lookHippo CMS - A first look
Hippo CMS - A first look
 
Spring Ldap
Spring LdapSpring Ldap
Spring Ldap
 
Spring In Alfresco Ecm
Spring In Alfresco EcmSpring In Alfresco Ecm
Spring In Alfresco Ecm
 

Recently uploaded

Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfPrecisely
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DaySri Ambati
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 

Recently uploaded (20)

Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 

Apache ManifoldCF @ Linux Day 2012

  • 2. About me ● Open Source ECM Specialist at Sourcesence ● Author and Technical Reviewer at Packt Publishing ○ Alfresco 3 Web Services (2010) ○ GateIn Cookbook (2012) ● Alfresco Community (nickname OpenPj) ○ Alfresco Wiki Gardener ○ Top 10 supporter (english and italian) ○ Moderator of the italian forum ● PMC Member at the Apache Software Foundation ● JBoss Community ○ Content editor for jboss.org ○ Project Leader and Committer for PortletSwap
  • 3. Overview ● The story ● What is ManifoldCF? ○ What is a repository? ○ What is a search server? ● Why ManifoldCF? ● Architecture ● The growing path ○ The 0.3-incubating version ○ The 0.4-incubating version ○ The 0.5-incubating version ○ The 0.6 version (graduated ^__^) ○ What's new in the 1.0.1 version ● The book: ManifoldCF in Action ● Demo ● Resources
  • 4. The story The original ManifoldCF code base was granted by MetaCarta Inc., to the Apache Software Foundation in December 2009. The MetaCarta effort represented more than five years of successful development and testing in multiple, challenging enterprise environments. The project was graduated as Apache Top Level Project in July 2012. ^__^
  • 5. What is ManifoldCF? ● Open Source crawler ○ schedule jobs to create indexes ■ get contents from repositories ■ push contents on search servers Search Server Repository 1 1 Search Server Repository 2 Apache ManifoldCF 2 Search Server Repository 3 3
  • 6. What is ManifoldCF? ● Open Source crawler ○ schedule jobs to create indexes ■ get contents from repositories ■ push contents on search servers ● Out-Of-The-Box it is distributed as J2EE web apps ○ REST API ○ Authority Service ○ Crawler UI ● Can be embedded in any Java application
  • 7. What is a repository? ● Open Source crawler ○ schedule jobs to create indexes ■ get contents from repositories ■ push contents on search servers Search Server Repository 1 1 Search Server Repository 2 Apache ManifoldCF 2 Search Server Repository 3 3
  • 8. What is a repository? ● central place where to put and get contents ● contents are kept is an organized way ○ ER model is the old way ○ Node graph ■ properties (metadata) ■ associations ■ renditions ● base component of Enterprise Content Management (ECM) systems ● is from Latin repositorium ○ table of service ○ vessel ○ chamber ○ where to keep and find your things!!!
  • 9. Enterprise Content Management Enterprise content management (ECM) is a formalized means of organizing and storing an organization's documents, and other content, that relate to the organization's processes. The term encompasses strategies, methods, and tools used throughout the lifecycle of the content. Wikipedia http://en.wikipedia.org/wiki/Enterprise_content_management
  • 10. Enterprise Content Management ECM Enterprise services Workflows and processes Users + Repository groups BPM (LDAP, IDM)
  • 11. What is a repository? - You use it!!! ● Some simple examples: ○ SMTP servers ○ Google Drive ○ Dropbox ● Some Open Source repository implementations: ○ exoJCR ○ Apache JackRabbit ● Some Open Source ECM systems for critical usage: ○ Alfresco ○ Nuxeo ○ Hippo
  • 12. What is a repository? - Decoration CMIS JCR REST SOAP IMAP apply EMAIL metadata FTP retrieve content using metadata Query Languages: CMIS JCR SQL Repository XPath Lucene Full Text (Google style) Indexes
  • 13. What is a repository? - Architecture APIs (CMIS, REST, FTP, WebDAV, IMAP) Model Storage Content Store Indexes
  • 14. What is a repository? - Repo Model ● different point of view of how managing data ○ no more Relational databases (ER) ● repositories offers you an API! ● based on the JCR Repository Model (JSR-283) ○ workspaces ○ identifiers ○ users ○ nodes and node types (contents) ■ properties and property types ■ associations (shared nodes)
  • 15. What is a repository? - Repo Model ● A node is a generic content stored in a repository ○ type ○ properties ○ associations ○ binary streams (optional) ■ renditions ■ text document ■ Video ■ Image ■. . .
  • 16. What is a repository? - Repo Model Properties Type (metadata): - name Node - description - mimetype - tags - categories Renditions Binary 1 Binary 2 Binary 3
  • 17. What is a repository? - Repo Model Repository Workspace 1 Workspace 2 Workspace Root node 3 A B C D E G
  • 18. Why use a repository? ● adding new node types means to add a configuration ● you can scale out easily ● storing very large amounts of data ● storing simple data structures, such as simple JSON documents ● looking up data by keys rather than using queries ● searching for data based upon relevance rather than criteria ● evolving schemas and/or data structures ● caching data in-memory for performance ● giving up consistency guarantees for increased availability
  • 19. Why use a repository? ● Standard API ○ Content Management Interoperability Services (CMIS) ○ Java Content Repository (JCR) ● Hierarchical structure ● Transaction support ● Versioning ● Locking ● Observation ● References ● Navigation services ○ parents ○ children ○ associated ● Search services
  • 20. What is a search server? ● Open Source crawler ○ schedule jobs to create indexes ■ get contents from repositories ■ push contents on search servers Search Server Repository 1 1 Search Server Repository 2 Apache ManifoldCF 2 Search Server Repository 3 3
  • 21. What is a search server? A search server is an application that allows users to find repository contents quickly using: ● keywords (full text search) ● content fields ● tags ● categories ● ranking The informations kept are indexes.
  • 22. What is a search server? REST API Storage Indexes
  • 23. Why ManifoldCF? ● Reliability ● Incremental ● Multi repositories ● Security model ● Monitoring
  • 24. Why ManifoldCF? - Reliability Jobs scheduling and configuration are stored in the database to maintain the state of all the executions Repository Pull Agent Daemon Search Server configuration and scheduling Database
  • 25. Why ManifoldCF? - Incremental Jobs can be optionally configured to re-visit contents incrementally Repository N1 Apache ManifoldCF N2 N4
  • 26. Why ManifoldCF? - Multi repositories Jobs can retrieve contents from the following repositories: ● CMIS-compliant ● Alfresco ● IBM FileNet ● EMC Documentum ● Microsoft SharePoint ● OpenText LiveLink ● Autonomy Meridio ● Memex Patriarch ● Windows Share/DFS ● Generic JDBC ● Generic Filesystem ● Generic RSS and Web
  • 27. Why ManifoldCF? - Multi repositories Jobs can ingest contents to the following search servers: ● Apache Solr ● ElasticSearch ● OpenSearchServer ● MetaCarta GTS
  • 28. Why ManifoldCF? - Security model Retrieve per-content ACLs Authority 1 Authority Service Authority 2 Authority 3 Repository 1 Repository 2 Pull Agent Daemon user access Repository 3 tokens doc access tokens user specific Search Server search results
  • 29. Why ManifoldCF? - Monitoring UI Crawler allows you to: ● configure jobs and connectors ● monitor jobs execution ● monitor contents ingestion ○ status reports ■ document status ■ queue status ○ history reports ■ simple history ■ maximum activity ■ maximum bandwidth ■ result histogram
  • 30. Architecture ● Pull Agent Daemon ○ Jobs ■ Repository Connectors ■ Output Connectors ■ Authority Connectors
  • 31. Architecture ● Pull Agent Daemon (the core service) ○ Jobs (execute the ingestion tasks) ■ Repository Connectors (retrieve contents) ■ Output Connectors (ingest contents) ■ Authority Connectors (retrieve ACLs)
  • 32. Architecture Authority Service Search Server Repository 1 1 Search Server Repository 2 Pull Agent Daemon 2 Search Server Repository 3 3 Database
  • 33. Architecture - Job A job is an ingestion work that consists of: ○ verbal description ○ repository connection ■ authority connection (optional) ○ metadata mapping ○ output connection (search server) ○ crawling model ○ scheduling information (on demand or time ranges)
  • 34. Architecture - Job Authority Connector ACLs Repository Connector retrieve Output content ACL Connector Repository Job Search Server - query to retrieve contents - metadata mapping - verbal description - content ingestion - crawling model - scheduling
  • 35. The 0.3-incubating version ● CMIS Repository Connector ● OpenSearchServer Output Connector ● Scripting Language ● New Maven build process ● Several bug fixes
  • 36. The 0.4-incubating version ● Alfresco Connector ● JDBC Connector now supports MySQL ● CMIS Connector upgraded to OpenCMIS 0.5.0 ● Several bug fixes
  • 37. The 0.5-incubating version ● Apache Velocity for connectors UI templates ● ElasticSearch Output Connector ● CMIS Connector upgraded to OpenCMIS 0.6.0 ● Prebuild connector support: just add jars and go! ● New Japanese localization ● Several bug fixes
  • 38. The 0.6 version ● Project moved to JDK 1.6 ● Many improvements for all the connectors ● Updated the Apache Solr plugin ● Several bugfixes
  • 39. What's new in 1.0.1 version ● Microsoft SharePoint 2010 support ● JDBC Connector now manages metadata ● CMIS Connector upgraded to OpenCMIS 0.7.0 ● Several bugfixes
  • 40. The book: ManifoldCF in Action ManifoldCF in Action by Karl Wright published by Manning Karl is the original developer and the principal committer of Apache ManifoldCF The book is available at the following site: http://www.manning.com/wright
  • 41. DEMO
  • 43. Thank you for your attention! ^__^ http://www.open4dev.com