Set up an open access ICAR
Institutional Repository: Hardware, Software, Policies and Personnel.
1. Set up an open access
Institutional Repository
-Hardware, Software, Policies and
Personnel
Deepak Sharma – DMR, Solan
2. Presentation covers the following items:
Abbreviations
What is open access?
What is an Institutional Repository?
Purpose of IR/Why Repository?
What contents should be in IR?
Repository Users
Conceptual components of repository
How repositories work
Technologies, standards and protocols used
Advantages of OA Repositories
Requirement of IR - Hardware, Software, Policy & Personnel
Some Interesting Facts about repositories
World Scenario of IR
India & China
ICAR initiatives
My outlook
Prototype of DMR Repository
Proposed Layout Diagram for AKMU-DMR
4. Abbreviations..
OA: Open Access
OAJ: Open Access Journals
OSS: Open Source Software - software available free or with limited restrictions
Semantic Web Tools: Tools used to design search engines
Data Mart: A departmental data depository
Repository: Online storage location for safely preserving data
ETL: Extract, Transform, Load (staging area server, ETL server)
Extract: Collect data from different sources
Transform: Data format conversion
Load: Initial data load
Relevant data is extracted from different sources, transformed
and finally loaded into the data warehouse.
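The extract, transform and load steps can be sketched in a few lines of Python. This is only an illustrative sketch: the two CSV sources, the `items` table and all function names are assumptions made for the example, not part of INARIS or any ICAR system.

```python
import csv
import io
import sqlite3

# Hypothetical source data: two repositories exporting CSV in different layouts.
SOURCE_A = "title,year\nMushroom yield trial,2012\n"
SOURCE_B = "TITLE;YEAR\nSpawn quality survey;2013\n"


def extract(text, delimiter):
    """Extract: read raw rows from one source."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


def transform(rows):
    """Transform: normalise column names and convert the year to an integer."""
    cleaned = []
    for row in rows:
        lowered = {key.lower(): value for key, value in row.items()}
        cleaned.append((lowered["title"].strip(), int(lowered["year"])))
    return cleaned


def load(conn, records):
    """Load: insert the cleaned records into the warehouse table."""
    conn.execute("CREATE TABLE IF NOT EXISTS items (title TEXT, year INTEGER)")
    conn.executemany("INSERT INTO items VALUES (?, ?)", records)
    conn.commit()


def run_etl(conn):
    """Run extract -> transform -> load for every source, then list the warehouse."""
    for text, delimiter in ((SOURCE_A, ","), (SOURCE_B, ";")):
        load(conn, transform(extract(text, delimiter)))
    return conn.execute("SELECT title, year FROM items ORDER BY year").fetchall()
```

Calling `run_etl(sqlite3.connect(":memory:"))` loads both sources into one in-memory table; a real warehouse would replace the inline strings with exports from each data mart.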
5. Central Harvester: A single point hosting the central data
warehouse; it uses ETL for automatic data harvesting
from associated OA repositories. (ATMs/banks, railways and airlines harvest
data in real time; other organisations can harvest data on a regular weekly or monthly basis.)
All CSIR labs' repositories are regularly
harvested by the central harvester.
Self-Archiving (Google Scholar): The act of depositing a free copy
of a digital document in the local data mart, or on
the web (W3*), in order to provide open access to it.
Promulgations would Infringe: Breach of the open
declaration agreement.
Abbreviations.. …cont’d
*www
6. Digital Preservation: Digital preservation is the
active management of digital information over time to
ensure its accessibility. Digital preservation is more
challenging than bookkeeping.
Embargo: In academic publishing, an embargo is a
period during which access is not allowed to certain
types of users (denial of access for a specific user for a specific period).
Business Intelligence (BI): An environment in which
business users receive information that is reliable, secure,
consistent, understandable, easily manipulated and timely,
and use it to conduct analyses.
Abbreviations.. …cont’d
7. Data Warehouse (DW): A collection of data pulled
together largely from operational systems
(repositories) and specifically structured and tuned
for easy access and use for query, reporting and
analysis purposes.
The data warehouse may contain historical
data, transactional data, collected data and
derived/calculated data.
The data in a DW are time varying, with changes
happening in a controlled (well ordered) form.
Abbreviations.. …cont’d
8. Data Mart: Data marts usually include a subset of the
organization's data and are focused on a specific subject area or
business area.
WalMart: By analogy, a Walmart usually includes a subset of the retail stores.
Data marts are developed to support a single business function or
process. Different business functions should have different data
marts.
The data mart is typically brief and collective.
Metadata: 'the documentation of data'. The bibliographic details
such as author names, institutional affiliation, date, title of the
article, abstract and so forth that are captured and maintained
with the purpose of supporting the development, administration
and usage/navigation of the BI environment.
Abbreviations.. …cont’d
9. Text Mining: Text mining, also referred to as text data
mining and roughly equivalent to text analytics, is the process of deriving high-quality
information from text by identifying patterns and trends
through means such as statistical pattern learning.
Web Mining: Extracting data from Web pages is called Web
scraping or Web data mining.
Data Mining: Data mining (sometimes called data or
knowledge discovery) is the process of analyzing data from
different perspectives and summarizing it into useful
information.
Techniques used in data mining: Artificial neural networks (computational
models inspired by animal central nervous systems that are capable of
machine learning and pattern recognition), Genetic algorithms (AI used to
generate useful search solutions), Decision trees, Nearest
neighbor method, Rule induction, Data visualization
Abbreviations.. …cont’d
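As a small illustration of text mining, the sketch below counts term frequencies across a handful of titles, which is one of the simplest ways to surface patterns in text. The stop-word list and the function name `term_frequencies` are assumptions made for this example; real systems use far richer statistical models.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list; a real one would be much longer.
STOPWORDS = {"the", "of", "and", "in", "a", "on", "for"}


def term_frequencies(documents):
    """Count how often each non-stop-word appears across all documents."""
    counts = Counter()
    for text in documents:
        words = re.findall(r"[a-z]+", text.lower())
        counts.update(word for word in words if word not in STOPWORDS)
    return counts
```

Feeding it repository titles and calling `most_common()` on the result gives a rough picture of which topics dominate a collection.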
10. What is open access?
Open access (OA) is the practice of providing
unrestricted access via the Internet to peer-reviewed
scholarly research. It is most commonly
applied to scholarly journal articles, but it is also
increasingly being provided to theses, book chapters
and scholarly monographs (a monograph is a detailed written study of a single specialized
subject).
http://www.openoasis.org
Symbol of Open Access
Scholarly or peer-reviewed articles are written by experts in academic or
professional fields. They are excellent sources for finding out what has been
studied or researched on a topic as well as to find bibliographic information.
11. An institutional repository is an online location for
collecting, preserving and disseminating the intellectual
output of an institute.
(Online storage location for safely preserving data)
Institution based
Long-term digital preservation
Scholarly knowledge materials in digital formats
Access to and distribution of its intellectual (knowledge)
assets
Collective and permanent
Open and interoperable
What is an Institutional Repository?
14. There are many purposes
Open access
Resource discovery
Dissemination of research widely
Research evaluation and assessment
Institutional and personal impact
Information asset management by institutions
Process improvements – store once, use many
Preserve easily lost grey documents ("grey" literature: written documents
such as theses or technical reports).
What is the purpose of an IR?
15. It can include …
Theses, Journals, Research Papers,
Conference proceedings, documents..
Presentations – Photo, PPT, Audio, Video
Metadata (Dictionary, Index of data)
Book chapters
Digitized material
Research data, annual reports etc.
What Contents should be in Repositories?
17. [Diagram: Conceptual components around repositories. Labels: Data Warehouse; DMR Data Mart-1, DMR Data Mart-2, DMR Data Mart-3 (data marts of different departments); Data Mining (data extract, transform, analysis); PaaS, SaaS, IaaS (e.g. Google Drive etc.)]
18. Universities/Institutes around the world use these repositories in the
following ways:
•Providing a central archive of the institute's work
•Managing collections of research documents
•Preserving digital materials for the long term
•Communicating among scholars to increase the impact of the
research
•Storing learning materials and courseware (IASRI - stat.iasri.res.in)
•Electronic publishing (digitized form)
•Knowledge management
•Research assessment
•Encouraging open access to scholarly research
Why Repositories?
Important..
19. Material submission (self Archive)
Metadata application (data indexing)
Access control (user’s access level)
Discovery support (support search engine)
Distribution (sharing data)
Preservation (data storage forlong term)
What are the core functions of the
repository?
20. • Material is hosted and managed on an Institutional
Repository Server, using IR software.
• Accessible on the organizational LAN (intranet) +
Internet (high speed dedicated broadband required).
• Scientists use a web browser to submit (deposit)
research material and also to search the repository
• Through the OAI* protocol, a central harvester
"harvests" metadata from individual IRs,
builds a cross-index and provides a single-point cross-
repository search service
• Security concerns could be handled at network, IR
and publication level
*Open Archives Initiative - Protocol for Metadata Harvesting.
How repositories work?
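The harvesting step can be illustrated with a short Python sketch. It builds an OAI-PMH `ListRecords` request URL and parses a Dublin Core response into a small cross-index. The `verb` and `metadataPrefix` parameters and the XML namespaces come from the OAI-PMH and Dublin Core specifications; the base URL, the sample response and the function names are assumptions made for illustration, and no network call is made here.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical, hand-written response in the shape a repository would return.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample thesis</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build the request a harvester sends to one repository."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return base_url + "?" + query


def parse_titles(xml_text):
    """Map each record identifier to its title (a minimal cross-index)."""
    root = ET.fromstring(xml_text)
    index = {}
    for record in root.iterfind(".//oai:record", NS):
        identifier = record.findtext("oai:header/oai:identifier", namespaces=NS)
        index[identifier] = record.findtext(".//dc:title", namespaces=NS)
    return index
```

A central harvester would fetch the URL from `list_records_url` for every registered repository and merge the dictionaries returned by `parse_titles` into one searchable index.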
21. IR software (open source/commercial)
OAI-PMH harvesting protocol/software (free)
Servers for IR
Linux/Red Hat OS, MySQL/PostgreSQL DBMS,
Apache/Tomcat web server, Perl/Java
Standard data formats: PDF, MS Word, MS
PPT, JPEG, MPEG, HTML, GIF
Technologies used
23. •The common protocol that they all follow is called the Open
Archives Initiative - Protocol for Metadata Harvesting (OAI-PMH).
•The contents of all repositories are then indexed by Web search
engines such as Google and Google Scholar, creating free online
OA databases.
•Self-archiving (the process by which authors deposit their work
in repositories) grows Open Access and populates a large
proportion of the scholarly literature.
•Because Google and other Web search engines index OA repositories, the
contents are available to all through Web access.
Archives: A collection of historical documents
Protocol Used
24. •Open Access benefits researchers, institutions, nations and society
as a whole.
•Open Access provides the material on which new semantic web
tools for data mining and text mining can work, generating new
knowledge from existing findings (data warehouse, data mart,
metadata).
•Traditionally, journals were sold to libraries; in the age of print on
paper this was the only model available that enabled
publishers to disseminate journals and recover the cost.
•Now, in the age of the WWW, research findings can be disseminated free of
charge to anyone who wishes to read them.
Advantages of Open Access
26. Services available
• Do it yourself (in house)
• Standard packages (Customizable)
• External Hosting (PaaS)
PaaS: Platform as a Service
SaaS: Software as a Service
e.g. cluster map in DMR website
IaaS: Infrastructure as a Service
Cloud Computing - SMU, MScIT
27. Do it yourself
• Advantages
Customized to fit well
Total control of program
Software can always be modified
• Disadvantages
Non-availability of qualified staff to design and
maintain services
Staff turnover (IT staffing) and problems with
upgrades
Long term maintenance
Cost of hardware and its maintenance
28. Standard Packages
• Advantages
Ready made solutions
Other functionalities that might be useful for you
Support from the community of users and IT personnel
Long term maintenance
• Disadvantages
Updates may require customizing /re-setup
No control over where upgrades may end up
Cost of hardware and its maintenance
Staff training
29. Hosted Services (PaaS)
Advantages
Ready made solution only needs data input
Regular upgrades with little input
Very minimal staff commitment
Probably easiest and fastest to setup
May contain more functionalities
Cost of Equipment and its maintenance
Disadvantages
No control over improvements (ICAR website is Drupal based)
Security concerns
30. Based on the needs, there are three types of options
available:
Open Source Software: The software is free to download. It
is open to changes from the developer community
(CDSware, DSpace, EPrints, Fedora, Greenstone).
Commercial Software: We have to pay for the software. The
software vendor keeps, creates, and maintains the source
code.
Software Service Model: A software vendor keeps and
distributes the software as a service (SaaS, PaaS, IaaS – concepts from cloud
computing), or also hosts and manages your data for you.
(Vendors: Open Repository or bepress).
Choosing repository software
32. Hardware Concerns
• Interface with software
functionality requirements
Software requirements
• Technical support
• Cost of maintenance
• Quantum of data input – grid computing required
• Size, format of individual articles
• Server space – physical storage
Storage capacity
Compatibility issues arise
33. Hardware concerns …cont’d
Network availability and reliability
Power supply and appropriate back-up
Onsite/offsite back-up hardware
o External hard disc
o Extra servers
e.g. Two sites - Online DSpace Server or Offline DSpace Server
36. • Data Policy for full-text and other full data items
• Content Policy for types of contents to set up
• Submission Policy concerning depositors, quality
and copyright, and levels of users and their rights
• Preservation Policy: digital preservation
• Digital preservation is just as big an issue as
hardware going out of date, data corruption
and data formats going out of date
• Metadata Policy for describing items in the
repository
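One common way to detect the data corruption mentioned under the preservation policy is a fixity check: record a checksum when an item is deposited and compare it later. The sketch below is a minimal illustration using SHA-256; the function names and the in-memory `manifest` dictionary are assumptions for the example, not a feature of any particular repository package.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Return the SHA-256 digest of an item's bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def find_corrupted(stored_items, manifest):
    """Return names whose current checksum differs from the recorded one."""
    return sorted(
        name
        for name, data in stored_items.items()
        if checksum(data) != manifest.get(name)
    )
```

The manifest would be written at deposit time and the check scheduled periodically, for example before each backup.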
37. Three policy areas need to be addressed in relation to repositories:
collection, management, and access.
1. Collection
•What types of materials will be accepted into the repository?
•Whose work can be included in the repository?
•Defining criteria to set up a collection in the repository.
•Who controls sets, and who will authorize membership?
•How will the repository be structured: around individual authors
(library system), or by department (departmental data mart), research
division, etc.?
•Are collections of content built around a department or an individual?
•Who will deposit content? (library staff or authors)
Developing Repository Policies
38. 2. Management
• General rights and responsibilities of those who create
collections of digital content.
• What types of metadata will be used?
• What preservation activities will be undertaken?
3. Access
• Privacy policy for registered users of the system.
• Will the repository restrict access to content if
requested by the author?
• Will the repository enable embargo periods (access denied to
certain users for a certain period) for content?
Developing Repository Policies …cont’d 1
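An embargo rule of this kind can be expressed as a simple access check. The sketch below is an assumption-laden illustration: the group names, the `exempt_groups` default and the function name `can_access` are hypothetical and would be set by the repository's own access policy.

```python
from datetime import date


def can_access(user_group, today, embargo_until, exempt_groups=("staff",)):
    """Allow access once the embargo has ended, or earlier for exempt groups."""
    if embargo_until is None or today >= embargo_until:
        return True  # no embargo, or the embargo period is over
    return user_group in exempt_groups
```

For example, `can_access("public", date(2013, 5, 1), date(2014, 1, 1))` is refused, while the same request from the hypothetical `"staff"` group is allowed.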
39. Policy Concerns
• Suggestion - what would be suitable for input
• Type of information
• Formats
• User groups/communities
• Who will be eligible to input – submitters
• Workflows
• Who gives the final OK – editors
• Preservation and withdrawal
• How long will inputs be kept
41. Policies …cont’d
• Metadata standards (documentation standards of repositories)
• Final and most important requirement
– Institutional commitment
The IR must be documented to ensure continuity after a change in
leadership.
42. Personnel
• Expertise
Information specialist /metadata specialist
Data warehouse specialist
IT staff
Other support staff
• Salaries
• Staff training
Limited knowledge of data warehousing, data mining, cloud computing and open
source among IT staff.
44. External Hosting services (PaaS)
• Digital Commons (bepress) - Digital Commons helps
institutions collect, showcase, and preserve scholarly output
• EPrints Services - The EPrints Services team offers a complete
range of advice and consultancy to support institutions that have
adopted, or are looking to adopt, the EPrints solution
• ExLibris DigiTool - DigiTool enables academic libraries and
library consortia to manage and provide access to digital resources
• Intrallect - intraLibrary is a web-based repository for managing
digital objects
45. External Hosting services…cont’d
• Open Repository - Open Repository is a service from BioMed
Central to build, launch, host and maintain institutional repositories
for organisations
• PublicationsList.org - A commercial public repository with
facilities to maintain publications lists (with optional full texts) for
individuals, groups, departments, or organisations
• VTLS VITAL - VITAL is an institutional repository solution
designed for universities, libraries, museums, archives and
information centers, built on Fedora
• OCLC - CONTENTdm - CONTENTdm is a single software
solution that handles the storage, management and delivery of your
library's digital collections to the Web
47. The Open Access (OA) movement has been a topic of
major debate and interest around the world. In the
developing countries it has been seen as an
extraordinary opportunity to provide equality of access
to essential research information and to raise awareness
of national research.
The problems that developing countries have always
faced with respect to research information are:
•The inability to afford subscriptions to journals, which creates a
knowledge gap between researchers.
•The inability to integrate national research into the
global knowledge pool.
Open access in developing countries
48. The scholarly knowledge arising from their
own research, which is critical for the development
of appropriate programmes to solve global
health and environmental problems such as
new infectious diseases, climate change and
agricultural security, is 'missing' due to
financial restrictions limiting the
publication and distribution of national
research literature.
49. Technical staff may manage the following aspects of
the service delivery:
• Service availability (24/7)
• Scalability (growth)
• Backup and recovery
• System maintenance, competition, migration
• Extensibility: access to other institutes' resources
• Customization
• Internationalization (Multilanguage)
• Data loading location
Technical Issues once a service is running
51. Scope of Open Access
Repositories data last harvested 21 April, 2013
•Total 2841 repositories around the world.
•There are 26,498,237 items held in world repositories.
•India has 71 registered repositories designed with
different platforms. China has 69 to date.
•Over the past three years the number has been growing at an
average rate of one repository per day.
•Registry of Open Access Repositories (ROAR).
•Directory of Open Access Repositories (OpenDOAR).
Repositories are also shown on a world map at Repository66.
Source: Repository66 - http://maps.repository66.org/
52. Under the NATP project Integrated National
Agricultural Resources Information System
(INARIS) (Rai et al., 2007), a Central Data
Warehouse (CDW) of agricultural resources was
established at IASRI.
This project has collaborations with 13
other organizations of ICAR.
In view of this, 13 different data marts were
designed.
This project was available at this link
(http://agdw.iasri.res.in)
ICAR Initiatives Data Warehouse
53. Successful Open Access
Repository
ICAR, India
http://agropedia.iitk.ac.in/
Agropedia is developed under the sponsorship of ICAR, NAIP
on a SaaS platform.
Consortia partners are:
61. [Diagram: Conceptual framework based on a single departmental data mart architecture (success rate is high). Labels: Operational systems, weekly refresh of each data mart; Projects, Disease, Crop, Extension and Library data marts, each with metadata (data dictionary, keeps indexes of data); Institutional OA metadata repository; ICAR central harvester to harvest the metadata of the OA repositories; Researchers, academics, professors; Ad hoc reports; Complex query and multidimensional analysis; Statistical analysis; Executive information system feed; Data mining; EIS/DSS; Query tools; BI systems]
62. •An Agri-Search Engine should be developed in the
country to aggregate information from the
internet and provide it to farmers in a
meaningful manner using ICT tools.
•The Agri-Search Engine should be coordinated with the
Govt. of India's agricultural websites to
monitor each website every day.
My outlook: the country should have an agri-search engine
63. SOUNONG (www.sounong.net), language: Chinese
The search engine monitors over 7000
websites per day within China to
collect agriculture data.
SOUNONG collects data
automatically by using soft robots
(web crawlers, web spiders, automatic indexers).
Source: Institute of Intelligent Machines,
2010
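The core step of any web crawler is pulling the links out of a fetched page so they can be visited next. The sketch below shows that step only, on an HTML string, using Python's standard library. It says nothing about how SOUNONG itself is built; the class name, function name and example URLs are assumptions made for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect absolute URLs from the <a href="..."> tags of one page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page's own URL.
                self.links.append(urljoin(self.base_url, href))


def extract_links(base_url, html_text):
    """Return every link found in html_text, in document order."""
    collector = LinkCollector(base_url)
    collector.feed(html_text)
    return collector.links
```

A daily monitoring job would fetch each registered site, run `extract_links` on the result, and queue any new URLs for indexing.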
64.
65. A global event, now in its 6th year, promoting Open Access as a
new norm in scholarship and research - October 21-27, 2013.
Indian Institute of Management and Commerce, Hyderabad
(India), jointly with Knowledge Connect for Empowerment
(a registered society), is organising a one-day programme on
27th October 2013 at Hyderabad, India.
Events..
66. [Diagram: Proposed layout diagram for AKMU-DMR (suggested layout diagram of AKMU). Labels: Entry door; Door Shower; doors; area for shoe rack/bags; area for almirah and book almirah; users' areas; no-entry zone; DEPSUN/AKMU; server room (switch and router rack, LAN server, AV server, metadata server, MIS/FMS server); server backup room (repair room, backup room for servers); area for battery backup room; split AC-1, AC-2, AC-3; exhausts 1, 2, 3; fire extinguishers]
Environment
Recommended server room temperature and humidity
Low-end temperature: 18 °C
High-end temperature: 27 °C
Low-end moisture: 5.5 °C (40% RH)
High-end moisture: 15 °C (55% RH)
Protecting computer workers from possible EMF and toxic fumes health risks
Electromagnetic radiation
•Ionizing: X-ray
•Non-ionizing: UV, infrared, microwave, radio frequencies (VLF, ELF), static magnetic fields, including DC electricity
Source: carbondescent.org.uk (Linton Hartifield, June 07, 2011)
67. Resources
social bookmarking e.g. Delicious:
http://delicious.com/
picture sharing e.g. Flickr: http://www.flickr.com/
bibliographic record sharing e.g. CiteULike:
http://www.citeulike.org/
sharing catalogue records e.g. LibraryThing:
http://www.librarything.com/
video sharing e.g. YouTube:
http://www.youtube.com/
Social Networking e.g. Facebook-
http://www.facebook.com
68. Resources …cont’d
• SPARC institutional repository checklist
& resource guide. Release 1.0, November
2002.
http://www.arl.org/sparc
• Open Society Institute. A guide to
institutional repository software. 2nd
Edition. January 2004.
http://www.soros.org/openaccess/software
• Open Archives Initiative (OAI).
69.
70. If Facebook were a country, it would be the 3rd
largest
A - Ripple Effect
It begins with a drop
And then you see the power of the ripple effect, as each small drop radiates out
a full 360 degrees, outwards and onwards. This is the philosophy we believe in when we
develop solutions: inspiration in a drop that becomes a wave of tangible, real-
world solutions that power your business.
Little drops make a mighty ocean
Who we are-we the team
73. The data warehouse is meant to provide systematic and periodic
information to research scientists, planners, decision makers
and developmental agencies via OLAP and decision support
systems. Specifically, the warehouse is expected to satisfy the
following goals:
• support agricultural research, management and education,
• improve the quality of research and planning,
• reduce duplication of research efforts,
• encourage dissemination of research findings,
• facilitate qualitative research supported by agricultural databases,
• help in the development of Decision Support Systems (DSS),
• serve as an effective tool for agricultural research and education planning,
• develop effective linkages with other national and international
organizations.
74.
75.
76. Microsoft's cloud platform, Windows Azure, is a little more
than a year old and is still gathering momentum. Azure has
blossomed into more than just a development play—it's a
full-fledged cloud services operating system that also offers
service hosting and service management.
Google has made a name for itself with its Google Apps
suite of business and consumer cloud applications and its
Google App Engine, the developer platform that lets users
build and host Web apps in the cloud in an effortless
fashion.
Not a platform in the traditional sense, Amazon's AWS Elastic
Beanstalk changes how developers push their apps into
Amazon's cloud. Developers upload the app and Elastic
Beanstalk handles the deployment details, capacity
provisioning, load balancing, auto-scaling and app health
monitoring.
Platform as a Service (PaaS) Vendors
There are many cloud platforms to choose from, all of which in one way or
another help developers build and deploy their applications to the cloud.
77. Creative Commons License
Creative Commons Corporation – a non-profit organisation
Released the CCL in 2002.
Presently this organisation is based at Stanford University
Law School.
The CCL provides a way for developers or creators to grant copyright
permissions for their creative work.
They have provided a strong foundation for licensing open
source use of works such as text, music, web sites and films.
The CCL is an ideal model for those who wish to draft their
own open source licenses.
They also offer Founder's Copyright, under which the creator of the work
holds the work for 14 years, with a renewal option.
Copyright means 'all rights reserved'.
'Some rights reserved' allows the creator to vary the copyright terms.