FILETABLE AND SEMANTIC SEARCH IN SQL
SERVER 2012
Michael Rys
Principal Program Manager
Microsoft Corp
@SQLServerMike


© 2012 Microsoft
MY FAVORITE BEYOND RELATIONAL APPLICATION

  Structured and unstructured Search
  Related/"Semantic" Search
BEYOND RELATIONAL DATA

  Pain Points
     Building and maintaining applications with relational and non-relational data is hard
        Complex integration
        Duplicated functionality
        Compensation for unavailable services

  Goals
     Reduce the cost of managing all data
     Simplify the development of applications over all data
     Provide management and programming services for all data
RICH UNSTRUCTURED DATA IN SQL SERVER 2012

• 80% of all data is not stored in databases!
  Most of it is “unstructured”

• Make SQL Server the preferred choice for managing Unstructured Data
  and allow building Rich Application Experience on top

• Address important customer requests for Capabilities and rich services
  for Rich Unstructured Data (RUDS)
    o Scale up storage and search to 100 million to 500 million documents
    o Easy use/access to Unstructured data from all applications
    o Rich insight into unstructured data to make better decisions
DEMO
Teaser: MySemanticSearch
http://mysemanticsearch.codeplex.com
RICH UNSTRUCTURED DATA & SERVICES ECOSYSTEM

[Diagram: the Database sits at the center and holds Blobs, FileStreams and FileTable. Database applications use transactional access; Windows apps and SQL apps use streaming Win32 access through the SMB share (files/folders) and the FileStream API. Rich services: Fulltext Search and Semantic Similarity Search. Scale-up solutions: multiple containers (Disk1, Disk2, Disk3). Integrated administration: Backup/Replication/AlwaysOn. Remote BLOB Storage: a customer application calls the SQL RBS API, which uses the Azure, Centera or SQL FILESTREAM library to store data in Azure, Centera or a SQL DB.]
DEMO
Integrated Management of documents in SQL Server 2012
FILETABLE OVERVIEW

• FileTable: A Table of Files/Directories
   • User created Table with a fixed schema
   • Contains FILESTREAM and File Attributes
   • Each row represents a File or a Directory
   • System defined constraints maintain the tree integrity

• File/Directory hierarchy view through a Windows Share
   • Supports Win32 APIs for File/Directory Management
   • DB Storage is Transparent to Win32 applications
   • SMB level of application compatibility
   • Virtual network name (VNN) path support for transparent Win32 application failover

[Diagram: FileTable Folder Hierarchy. The FILESTREAM share (MSSQLSERVER) contains the database directories Private Docs (Database1) and Office Docs (Database2). A database directory contains the FileTable directories Media, Documents and LogFiles (each a FileTable), and below those sits the user-defined directory structure. Path labels in the diagram: \\my_machine\MSSQLSERVER and Office Docs\Documents.]
CREATING A FILETABLE

  Pre-requisites
      Enable FILESTREAM
      Create FILESTREAM Share and Filegroup
      Enable non-transactional access at the DB level
       ALTER DATABASE Contoso SET FILESTREAM( non_transacted_access=FULL,
         Directory_name = N'Contoso')


  Create FileTable
 CREATE TABLE Contoso..Documents AS FILETABLE
       WITH (filetable_directory = N'Document Library')
   Access at   \\<machine name>\<FILESTREAM share>\Contoso\Document Library
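
The first two pre-requisites are not spelled out on the slide. A minimal sketch of how they might look in T-SQL follows, assuming FILESTREAM has already been enabled for the instance in SQL Server Configuration Manager. The filegroup name, logical file name and folder path are hypothetical.

```sql
-- Sketch only: ContosoFileStreamGroup, ContosoFileStream1 and C:\Data are hypothetical names.

-- Allow both T-SQL and Win32 streaming access to FILESTREAM data (level 2).
EXEC sp_configure 'filestream access level', 2;
RECONFIGURE;
GO

-- Add a FILESTREAM filegroup to the database.
ALTER DATABASE Contoso
    ADD FILEGROUP ContosoFileStreamGroup CONTAINS FILESTREAM;
GO

-- Add one container to it. FILENAME names a folder, not a file:
-- the parent folder (C:\Data) must exist, the last folder must not.
ALTER DATABASE Contoso
    ADD FILE (NAME = N'ContosoFileStream1',
              FILENAME = N'C:\Data\ContosoFileStream1')
    TO FILEGROUP ContosoFileStreamGroup;
GO
```

With these in place, the ALTER DATABASE ... SET FILESTREAM and CREATE TABLE ... AS FILETABLE statements from the slide can be run.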
MODIFYING A FILETABLE

  FileTable has a fixed schema
     Columns, system defined constraints cannot be altered/dropped
     Allows user defined indexes/constraints/triggers
  Disabling/Enabling FileTable Namespace
     ALTER TABLE Documents DISABLE FILETABLE_NAMESPACE
     Disables all system-defined constraints and Win32 access to
     FileTable
     Useful for bulk-loading/re-organization of data
  FileTable can be dropped similar to any other table
  Catalog views can be used for obtaining metadata
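
As a hedged illustration of the points above, the following sketch disables and re-enables the FileTable namespace around a bulk load and then reads FileTable metadata from catalog views. It assumes the Documents FileTable lives in the dbo schema of the Contoso database.

```sql
-- Sketch only: assumes the FileTable is Contoso.dbo.Documents.
USE Contoso;
GO

-- Turn off the system-defined constraints and Win32 access before a bulk load.
ALTER TABLE dbo.Documents DISABLE FILETABLE_NAMESPACE;
GO

-- ... bulk-load or re-organize rows here ...

-- Re-enable the namespace; the constraints are enforced again.
ALTER TABLE dbo.Documents ENABLE FILETABLE_NAMESPACE;
GO

-- List the FileTables in the current database and their directory names.
SELECT OBJECT_NAME(object_id) AS filetable_name,
       is_enabled,
       directory_name
FROM sys.filetables;

-- Show the database-level non-transacted access setting and directory name.
SELECT non_transacted_access_desc,
       directory_name
FROM sys.database_filestream_options
WHERE database_id = DB_ID();
```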
DATA ACCESS – FILE SYSTEM ACCESS
  FileTable hierarchy is visible through Filestream share
 \\machine\<FILESTREAMshare>\<Database_directory>\<FileTable_Directory>\...
       Provides transparent Win32 API & File/Directory Management capabilities
           e.g. MS Word can create/open/save files; xcopy can copy directory trees into the
           database

  Win32 API operations are non-transactional
       Operations cannot be part of any user transactions
       Win32 operations are intercepted by SQL Server at the File system level
           e.g. File/Directory creation/deletion => insert/delete into FileTable
       Full locking/concurrency semantics with other accesses
       Allows in-place update of file stream data/File attributes

  Transactional FILESTREAM APIs can also be used.
DATA ACCESS – T-SQL ACCESS

  Normal Insert/Update/Delete allowed for the FileTable manipulation
     FileTable Namespace integrity constraints enforced
     Set based operations on the File-attributes – value add

  Built-in functions
     GetFileNamespacePath() – UNC path for a file/directory (called on the file_stream column)
     FileTableRootPath() – UNC path to the FileTable root
     GetPathLocator() – path_locator value for a file/directory

  DDL/DML Triggers are supported
     DML triggers on a FileTable cannot update any FileTables
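
The T-SQL access path can be sketched as follows. This is an illustration only: it assumes the Documents FileTable from the earlier slide lives in the dbo schema, and the file and folder names are made up.

```sql
-- Sketch only: dbo.Documents, 'Reports' and 'readme.txt' are assumptions for illustration.
USE Contoso;
GO

-- Create a directory and a file with ordinary INSERT statements.
INSERT INTO dbo.Documents (name, is_directory)
VALUES (N'Reports', 1);

INSERT INTO dbo.Documents (name, file_stream)
VALUES (N'readme.txt', CAST('Hello FileTable' AS varbinary(max)));

-- Set-based query over file attributes, with the full UNC path of each item.
SELECT name,
       is_directory,
       cached_file_size,
       last_write_time,
       file_stream.GetFileNamespacePath(1) AS full_unc_path
FROM dbo.Documents
ORDER BY name;

-- UNC path of the FileTable root directory.
SELECT FileTableRootPath(N'dbo.Documents') AS root_path;

-- Translate a UNC path back into the path_locator value of that row.
SELECT GetPathLocator(FileTableRootPath(N'dbo.Documents') + N'\readme.txt') AS path_locator;
```

Rows inserted this way show up as files and folders under the FileTable directory of the FILESTREAM share.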
MANAGING FILETABLE

  DB Backup/Restore operations include FileTable data
     Point-in-time restore may contain more recent FILESTREAM data due to
     non-transactional updates during backup
  FileTables are secured similar to any other user tables
     Same security is enforced for Win32 access also
  Data Loading
     Windows tools like xcopy/robocopy OR drag-drop operations through
     Windows Explorer can be used
     BCP operations are supported for direct T-SQL data inserts
  SSMS supports FileTable creation/exploration
MANAGING FILETABLE – HIGH AVAILABILITY

SQL Server 2012 AlwaysOn is fully supported

   Transparent data failover
      FileTables can be configured with multiple secondary nodes
      Both sync and async data replication is supported
      File data and metadata are available on the secondary in case of failover
   Transparent application failover
      Virtual network name (VNN) path support for transparent Win32 application failover
      Applications use the \\VNN\Share\db\... path
      Applications are automatically redirected to the secondary in case of failover
   Restrictions
      FileTables cannot participate in “Read-only” replicas.
FILETABLE RESTRICTIONS

  FileTables cannot be partitioned
  Merge/Transactional replications are not supported
  RCSI/Snapshot isolation mode
        Applications cannot modify file stream data in FileTables
  Win32 Application compatibility
        Memory mapped files, Directory notifications, links are not supported
UNSTRUCTURED DATA SCALE-UP
MULTIPLE CONTAINERS FOR FILESTREAM DATA
   SQL 2008 R2
      Only one storage container/FILESTREAM filegroup
      Limits storage capacity scaling and I/O scaling

   SQL Server 2012
      Support for multiple storage containers per FILESTREAM filegroup
          DDL changes to CREATE/ALTER DATABASE statements
          Ability to set max_size for the containers
          DBCC SHRINKFILE with EMPTYFILE support
      Scaling Flexibility
          Storage scaling by adding additional storage drives
          I/O scaling with multiple spindles
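
A hedged sketch of the multiple-container DDL follows. The filegroup name, logical file names, drive letter and size limit are hypothetical, and an existing FILESTREAM filegroup with a first container is assumed.

```sql
-- Sketch only: ContosoFileStreamGroup, ContosoFileStream1/2, E:\Data and 200 GB are hypothetical.

-- Add a second container on another drive to the same FILESTREAM filegroup,
-- capped with MAXSIZE (both are new for FILESTREAM in SQL Server 2012).
ALTER DATABASE Contoso
    ADD FILE (NAME = N'ContosoFileStream2',
              FILENAME = N'E:\Data\ContosoFileStream2',
              MAXSIZE = 200 GB)
    TO FILEGROUP ContosoFileStreamGroup;
GO

-- Ask SQL Server to move the data out of the first container into the others,
-- for example before retiring a drive.
USE Contoso;
GO
DBCC SHRINKFILE (N'ContosoFileStream1', EMPTYFILE);
GO
```

New FILESTREAM data is then spread across the containers of the filegroup, which is what gives the storage and I/O scaling listed above.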
UNSTRUCTURED DATA : MULTIPLE CONTAINERS




  Use of multiple spindles for achieving better I/O Scalability
RUDS SCALE-UP: FILESTREAM PERF/SCALE
 Improved performance of T-SQL and File I/O access
  Various enhancements to improve read/write throughput
    5 fold increase in Read throughput
    Linear scaling with large number of concurrent threads




SUMMARY: FILETABLE

  Application Compatibility for Windows Applications
    Windows applications run on top of files stored in FileTables with
    no modifications
  Relational Value Proposition
    Provide Integrated Administration and Services
       Backup, Log Shipping, HA-DR, Full text and Semantic search, …
    T-SQL orthogonality
       File/Folder attributes surfaced through relational columns
       Power of set based operations, Policy Management, Reporting etc
    FileNamespace Hierarchy management
FULL TEXT SEARCH IMPROVEMENTS IN SQL SERVER 2012
    Improved Performance and Scale:
      Scale-up to 350M documents
      iFTS query perf 7-10 times faster than in SQL Server 2008
      Worst-case iFTS query response times < 3 sec for corpus
      On par with or better than the main database search competitors
   New Functionality:
      Property Search
      Customizable NEAR
      New word breakers: updates to existing word breakers, plus Czech and Greek
   Innovation in Search:
      Semantic Similarity Search
FULLTEXT SEARCH PERFORMANCE & SCALE IMPROVEMENTS
    Architectural Improvements
       Improved internal implementation
       Queries no longer block Index updates
       Improved Query Plans:
           Better Plans for common queries
           Fulltext predicate folding
           Parallel Plan execution


    Index and query tested at scale up to 350 million documents with
    response times under about 2 sec
    ~3X better throughput without DML and ~9X better with DML
    Scale easily with increasing number of connections
SCALE-UP: FULL-TEXT SEARCH

[Chart: 2005/8 vs 2012]

Queries over a 350M-document database with random DMLs running in the background.
Beats SQL Server 2005 with a scale factor of more than 2x and, on average, 60x better throughput.
SCALE-UP: FULL-TEXT SEARCH

[Chart: 2005/8 vs 2012]

Average query execution time (ms) for various numbers of connections (50 to 2000 users) in a customer
playback benchmark
FULLTEXT PROPERTY SCOPED SEARCH
New Search Filter for Document Properties
      CONTAINS (PROPERTY ( { column_name }, 'property_name' ), 'contains_search_condition' )
• Set up once per instance to load the Office filters
         exec sp_fulltext_service 'load_os_resources',1
         go
         exec sp_fulltext_service 'restart_all_fdhosts'
         go
• Create a property list
         CREATE SEARCH PROPERTY LIST p1;

• Add properties to be extracted
         ALTER SEARCH PROPERTY LIST [p1] ADD N'System.Author' WITH
           (PROPERTY_SET_GUID = 'f29f85e0-4ff9-1068-ab91-08002b27b3d9',
           PROPERTY_INT_ID = 4, PROPERTY_DESCRIPTION = N'System.Author');

• Create/Alter Fulltext index to specify property list to be extracted
         ALTER FULLTEXT INDEX ON fttable... SET SEARCH PROPERTY LIST = [p1];

• Query for properties
         SELECT * FROM fttable WHERE CONTAINS(PROPERTY(ftcol, 'System.Author'), 'fernlope');
FULL-TEXT CUSTOMIZABLE NEAR
OLD NEAR SYNTAX
select * from fttable where contains(*, 'test near Space')

NEW NEAR USAGES
• SPECIFY DISTANCE
        select * from fttable
        where contains(*, 'near((test, Space), 5,false)')

• REDUCE DISTANCE
        select * from fttable
        where contains(*, 'near((test, Space), 2,false)')

• ORDER OF WORDS IS SPECIFIED AS IMPORTANT
        select * from fttable
        where contains(*, 'near((test, Space), 5,true)')
STATISTICAL SEMANTIC SEARCH
  Semantic Insight into textual content
     Uses language models to find most important keywords in document
         No need to build brittle ontologies!
     Statistically Prominent Keywords
         Autogenerated tag clouds
     Potentially Related Content based on extracted Keywords, such as
         Similar Products (based on description)
         Similar Jobs or Applicants
         Similar Support Incidents (based on call logs)
         Potential Solutions (based on similar incidents)

  First class usage experience
     Efficient linear algorithms
     Integrated with FTS and SQL
         New Rowset functions for all results using SQL query
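
The rowset functions mentioned above can be sketched as follows. The example reuses the fttable and ftcol names from the full-text slides and assumes that the table already has a full-text index, that its full-text key is an int, and that the semantic language statistics database is installed. The key values 1 and 2 are made up.

```sql
-- Sketch only: dbo.fttable, ftcol, the int key type and the key values are assumptions.

-- Turn on statistical semantic indexing for an already full-text indexed column.
ALTER FULLTEXT INDEX ON dbo.fttable
    ALTER COLUMN ftcol ADD STATISTICAL_SEMANTICS;
GO

DECLARE @DocKey int = 1;      -- hypothetical source document key
DECLARE @MatchedKey int = 2;  -- hypothetical matched document key

-- Statistically prominent keyphrases of one document (e.g. for a tag cloud).
SELECT TOP (10) keyphrase, score
FROM SEMANTICKEYPHRASETABLE(dbo.fttable, ftcol, @DocKey)
ORDER BY score DESC;

-- Documents that are potentially related to the source document.
SELECT TOP (10) matched_document_key, score
FROM SEMANTICSIMILARITYTABLE(dbo.fttable, ftcol, @DocKey)
ORDER BY score DESC;

-- Keyphrases that explain why two documents were matched.
SELECT TOP (5) keyphrase, score
FROM SEMANTICSIMILARITYDETAILSTABLE(dbo.fttable, ftcol, @DocKey, ftcol, @MatchedKey)
ORDER BY score DESC;
```

Because these are ordinary rowset functions, their results can be joined back to the source table on the full-text key to show titles instead of keys.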
DEMO
Semantic Extraction and Relationships
FullText Search in SQL Server 2012
SEMANTIC SIMILARITY
 • Input: Text such as varchar, Office, PDF, HTML, email…
   Output: Rowset functions with standard SQL queries
      Illustrating example:

      Source Table
      Key   Title                Document
      D1    Annual Budget        …
      D2    Corporate Earnings   …
      D3    Marketing Reports    …
      …     …                    …

      Full-Text and Semantic Processing of the source documents (e.g. quarter, record, revenue…)

      Keyword Index (Full-Text)
      ID    Keyword   Colid   …   compDocid   CompOc               CompPid
      K1    revenue   1       …   10,23,123   (1,4),(5,8),(1,34)   2,5,6,8,4,3
      K2    growth    1       …   10,23,123   (1,5),(5,9),(1,34)   2,5,6,8,5,4
      …     …         …       …   …           …                    …

      Keyphrases
      ID    Keyword
      T1    revenue
      T2    growth
      T3    Windows
      T4    Azure
      …     …

      KeyphraseDocuments
      ID             DocID
      T1 (revenue)   D1 (Annual Budget)
      T2 (growth)    D2 (Corporate Earnings)
      T3 (Windows)   D3 (Marketing Reports)
      …              …
      T1 (revenue)   D7 (Finance Report)
      …              …
      T3 (Windows)   D11 (Azure Strategy)
      T4 (Azure)     D11 (Azure Strategy)

      DocumentSimilarity
      DocID                    MatchedDocID
      D1 (Annual Budget)       D2 (Corporate Earnings)
      D1 (Annual Budget)       D7 (Finance Report)
      D3 (Marketing Reports)   D11 (Azure Strategy)
      …                        …
SEMANTIC EXTRACTION: END-2-END EXPERIENCE

• Downloadable Language Statistical Database with registration stored
  procedure
• Setup along with Full-Text
• Metadata / Catalog views
• System level DMVs for progress state and usage
• Manageability through SSMS and SMO
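
A minimal sketch of the registration and monitoring steps follows. It assumes the downloaded language statistics database has been extracted to a hypothetical folder, C:\Data, under its default file names.

```sql
-- Sketch only: the C:\Data folder is hypothetical.

-- Attach the downloaded semantic language statistics database.
CREATE DATABASE semanticsdb
    ON (FILENAME = N'C:\Data\semanticsdb.mdf'),
       (FILENAME = N'C:\Data\semanticsdb_log.ldf')
    FOR ATTACH;
GO

-- Register it with the instance using the registration stored procedure.
EXEC sp_fulltext_semantic_register_language_statistics_db @dbname = N'semanticsdb';
GO

-- Catalog views: which statistics database is registered, and for which languages.
SELECT * FROM sys.fulltext_semantic_language_statistics_database;
SELECT * FROM sys.fulltext_semantic_languages;

-- DMV: progress of the semantic similarity population.
SELECT * FROM sys.dm_fts_semantic_similarity_population;
```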
KEY TAKEAWAYS

  SQL Server's unstructured data support is a key strategy to
  enable you to build complex data applications that go
  beyond relational data!
    Content and Collaboration, eDiscovery, Healthcare, Document
    management etc.
RELATED CONTENT

 SQL Server 2012 Whitepapers and information:
   http://www.sqlserverlaunch.com
 Channel 9 DataBound Episode 2: http://channel9.msdn.com
 MySemanticSearch Demo: http://mysemanticsearch.codeplex.com
 More demo data sets and demo scripts:
 http://blogs.msdn.com/b/sqlfts/archive/2011/07/21/introducing-fulltext-statistical-semantic-search-in-sql-server-codename-denali-release.aspx
 Microsoft Virtual Academy Recording: Coming Soon!
The Object Evolution - EMC Object-Based Storage for Active Archiving and Appl...
 
DMS
DMSDMS
DMS
 

Mehr von Michael Rys

Big Data and Data Warehousing Together with Azure Synapse Analytics (SQLBits ...
Big Data and Data Warehousing Together with Azure Synapse Analytics (SQLBits ...Big Data and Data Warehousing Together with Azure Synapse Analytics (SQLBits ...
Big Data and Data Warehousing Together with Azure Synapse Analytics (SQLBits ...Michael Rys
 
Big Data Processing with .NET and Spark (SQLBits 2020)
Big Data Processing with .NET and Spark (SQLBits 2020)Big Data Processing with .NET and Spark (SQLBits 2020)
Big Data Processing with .NET and Spark (SQLBits 2020)Michael Rys
 
Running cost effective big data workloads with Azure Synapse and ADLS (MS Ign...
Running cost effective big data workloads with Azure Synapse and ADLS (MS Ign...Running cost effective big data workloads with Azure Synapse and ADLS (MS Ign...
Running cost effective big data workloads with Azure Synapse and ADLS (MS Ign...Michael Rys
 
Running cost effective big data workloads with Azure Synapse and Azure Data L...
Running cost effective big data workloads with Azure Synapse and Azure Data L...Running cost effective big data workloads with Azure Synapse and Azure Data L...
Running cost effective big data workloads with Azure Synapse and Azure Data L...Michael Rys
 
Big Data Processing with Spark and .NET - Microsoft Ignite 2019
Big Data Processing with Spark and .NET - Microsoft Ignite 2019Big Data Processing with Spark and .NET - Microsoft Ignite 2019
Big Data Processing with Spark and .NET - Microsoft Ignite 2019Michael Rys
 
Bringing the Power and Familiarity of .NET, C# and F# to Big Data Processing ...
Bringing the Power and Familiarity of .NET, C# and F# to Big Data Processing ...Bringing the Power and Familiarity of .NET, C# and F# to Big Data Processing ...
Bringing the Power and Familiarity of .NET, C# and F# to Big Data Processing ...Michael Rys
 
Building data pipelines for modern data warehouse with Apache® Spark™ and .NE...
Building data pipelines for modern data warehouse with Apache® Spark™ and .NE...Building data pipelines for modern data warehouse with Apache® Spark™ and .NE...
Building data pipelines for modern data warehouse with Apache® Spark™ and .NE...Michael Rys
 
Modernizing ETL with Azure Data Lake: Hyperscale, multi-format, multi-platfor...
Modernizing ETL with Azure Data Lake: Hyperscale, multi-format, multi-platfor...Modernizing ETL with Azure Data Lake: Hyperscale, multi-format, multi-platfor...
Modernizing ETL with Azure Data Lake: Hyperscale, multi-format, multi-platfor...Michael Rys
 
Best Practices and Performance Tuning of U-SQL in Azure Data Lake (SQL Konfer...
Best Practices and Performance Tuning of U-SQL in Azure Data Lake (SQL Konfer...Best Practices and Performance Tuning of U-SQL in Azure Data Lake (SQL Konfer...
Best Practices and Performance Tuning of U-SQL in Azure Data Lake (SQL Konfer...Michael Rys
 
Bring your code to explore the Azure Data Lake: Execute your .NET/Python/R co...
Bring your code to explore the Azure Data Lake: Execute your .NET/Python/R co...Bring your code to explore the Azure Data Lake: Execute your .NET/Python/R co...
Bring your code to explore the Azure Data Lake: Execute your .NET/Python/R co...Michael Rys
 
Best practices on Building a Big Data Analytics Solution (SQLBits 2018 Traini...
Best practices on Building a Big Data Analytics Solution (SQLBits 2018 Traini...Best practices on Building a Big Data Analytics Solution (SQLBits 2018 Traini...
Best practices on Building a Big Data Analytics Solution (SQLBits 2018 Traini...Michael Rys
 
U-SQL Killer Scenarios: Custom Processing, Big Cognition, Image and JSON Proc...
U-SQL Killer Scenarios: Custom Processing, Big Cognition, Image and JSON Proc...U-SQL Killer Scenarios: Custom Processing, Big Cognition, Image and JSON Proc...
U-SQL Killer Scenarios: Custom Processing, Big Cognition, Image and JSON Proc...Michael Rys
 
Introduction to Azure Data Lake and U-SQL for SQL users (SQL Saturday 635)
Introduction to Azure Data Lake and U-SQL for SQL users (SQL Saturday 635)Introduction to Azure Data Lake and U-SQL for SQL users (SQL Saturday 635)
Introduction to Azure Data Lake and U-SQL for SQL users (SQL Saturday 635)Michael Rys
 
U-SQL Killer Scenarios: Taming the Data Science Monster with U-SQL and Big Co...
U-SQL Killer Scenarios: Taming the Data Science Monster with U-SQL and Big Co...U-SQL Killer Scenarios: Taming the Data Science Monster with U-SQL and Big Co...
U-SQL Killer Scenarios: Taming the Data Science Monster with U-SQL and Big Co...Michael Rys
 
The Road to U-SQL: Experiences in Language Design (SQL Konferenz 2017 Keynote)
The Road to U-SQL: Experiences in Language Design (SQL Konferenz 2017 Keynote)The Road to U-SQL: Experiences in Language Design (SQL Konferenz 2017 Keynote)
The Road to U-SQL: Experiences in Language Design (SQL Konferenz 2017 Keynote)Michael Rys
 
Killer Scenarios with Data Lake in Azure with U-SQL
Killer Scenarios with Data Lake in Azure with U-SQLKiller Scenarios with Data Lake in Azure with U-SQL
Killer Scenarios with Data Lake in Azure with U-SQLMichael Rys
 
ADL/U-SQL Introduction (SQLBits 2016)
ADL/U-SQL Introduction (SQLBits 2016)ADL/U-SQL Introduction (SQLBits 2016)
ADL/U-SQL Introduction (SQLBits 2016)Michael Rys
 
U-SQL Learning Resources (SQLBits 2016)
U-SQL Learning Resources (SQLBits 2016)U-SQL Learning Resources (SQLBits 2016)
U-SQL Learning Resources (SQLBits 2016)Michael Rys
 
U-SQL Federated Distributed Queries (SQLBits 2016)
U-SQL Federated Distributed Queries (SQLBits 2016)U-SQL Federated Distributed Queries (SQLBits 2016)
U-SQL Federated Distributed Queries (SQLBits 2016)Michael Rys
 
U-SQL Partitioned Data and Tables (SQLBits 2016)
U-SQL Partitioned Data and Tables (SQLBits 2016)U-SQL Partitioned Data and Tables (SQLBits 2016)
U-SQL Partitioned Data and Tables (SQLBits 2016)Michael Rys
 

Mehr von Michael Rys (20)

Big Data and Data Warehousing Together with Azure Synapse Analytics (SQLBits ...
Big Data and Data Warehousing Together with Azure Synapse Analytics (SQLBits ...Big Data and Data Warehousing Together with Azure Synapse Analytics (SQLBits ...
Big Data and Data Warehousing Together with Azure Synapse Analytics (SQLBits ...
 
Big Data Processing with .NET and Spark (SQLBits 2020)
Big Data Processing with .NET and Spark (SQLBits 2020)Big Data Processing with .NET and Spark (SQLBits 2020)
Big Data Processing with .NET and Spark (SQLBits 2020)
 
Running cost effective big data workloads with Azure Synapse and ADLS (MS Ign...
Running cost effective big data workloads with Azure Synapse and ADLS (MS Ign...Running cost effective big data workloads with Azure Synapse and ADLS (MS Ign...
Running cost effective big data workloads with Azure Synapse and ADLS (MS Ign...
 
Running cost effective big data workloads with Azure Synapse and Azure Data L...
Running cost effective big data workloads with Azure Synapse and Azure Data L...Running cost effective big data workloads with Azure Synapse and Azure Data L...
Running cost effective big data workloads with Azure Synapse and Azure Data L...
 
Big Data Processing with Spark and .NET - Microsoft Ignite 2019
Big Data Processing with Spark and .NET - Microsoft Ignite 2019Big Data Processing with Spark and .NET - Microsoft Ignite 2019
Big Data Processing with Spark and .NET - Microsoft Ignite 2019
 
Bringing the Power and Familiarity of .NET, C# and F# to Big Data Processing ...
Bringing the Power and Familiarity of .NET, C# and F# to Big Data Processing ...Bringing the Power and Familiarity of .NET, C# and F# to Big Data Processing ...
Bringing the Power and Familiarity of .NET, C# and F# to Big Data Processing ...
 
Building data pipelines for modern data warehouse with Apache® Spark™ and .NE...
Building data pipelines for modern data warehouse with Apache® Spark™ and .NE...Building data pipelines for modern data warehouse with Apache® Spark™ and .NE...
Building data pipelines for modern data warehouse with Apache® Spark™ and .NE...
 
Modernizing ETL with Azure Data Lake: Hyperscale, multi-format, multi-platfor...
Modernizing ETL with Azure Data Lake: Hyperscale, multi-format, multi-platfor...Modernizing ETL with Azure Data Lake: Hyperscale, multi-format, multi-platfor...
Modernizing ETL with Azure Data Lake: Hyperscale, multi-format, multi-platfor...
 
Best Practices and Performance Tuning of U-SQL in Azure Data Lake (SQL Konfer...
Best Practices and Performance Tuning of U-SQL in Azure Data Lake (SQL Konfer...Best Practices and Performance Tuning of U-SQL in Azure Data Lake (SQL Konfer...
Best Practices and Performance Tuning of U-SQL in Azure Data Lake (SQL Konfer...
 
Bring your code to explore the Azure Data Lake: Execute your .NET/Python/R co...
Bring your code to explore the Azure Data Lake: Execute your .NET/Python/R co...Bring your code to explore the Azure Data Lake: Execute your .NET/Python/R co...
Bring your code to explore the Azure Data Lake: Execute your .NET/Python/R co...
 
Best practices on Building a Big Data Analytics Solution (SQLBits 2018 Traini...
Best practices on Building a Big Data Analytics Solution (SQLBits 2018 Traini...Best practices on Building a Big Data Analytics Solution (SQLBits 2018 Traini...
Best practices on Building a Big Data Analytics Solution (SQLBits 2018 Traini...
 
U-SQL Killer Scenarios: Custom Processing, Big Cognition, Image and JSON Proc...
U-SQL Killer Scenarios: Custom Processing, Big Cognition, Image and JSON Proc...U-SQL Killer Scenarios: Custom Processing, Big Cognition, Image and JSON Proc...
U-SQL Killer Scenarios: Custom Processing, Big Cognition, Image and JSON Proc...
 
Introduction to Azure Data Lake and U-SQL for SQL users (SQL Saturday 635)
Introduction to Azure Data Lake and U-SQL for SQL users (SQL Saturday 635)Introduction to Azure Data Lake and U-SQL for SQL users (SQL Saturday 635)
Introduction to Azure Data Lake and U-SQL for SQL users (SQL Saturday 635)
 
U-SQL Killer Scenarios: Taming the Data Science Monster with U-SQL and Big Co...
U-SQL Killer Scenarios: Taming the Data Science Monster with U-SQL and Big Co...U-SQL Killer Scenarios: Taming the Data Science Monster with U-SQL and Big Co...
U-SQL Killer Scenarios: Taming the Data Science Monster with U-SQL and Big Co...
 
The Road to U-SQL: Experiences in Language Design (SQL Konferenz 2017 Keynote)
The Road to U-SQL: Experiences in Language Design (SQL Konferenz 2017 Keynote)The Road to U-SQL: Experiences in Language Design (SQL Konferenz 2017 Keynote)
The Road to U-SQL: Experiences in Language Design (SQL Konferenz 2017 Keynote)
 
Killer Scenarios with Data Lake in Azure with U-SQL
Killer Scenarios with Data Lake in Azure with U-SQLKiller Scenarios with Data Lake in Azure with U-SQL
Killer Scenarios with Data Lake in Azure with U-SQL
 
ADL/U-SQL Introduction (SQLBits 2016)
ADL/U-SQL Introduction (SQLBits 2016)ADL/U-SQL Introduction (SQLBits 2016)
ADL/U-SQL Introduction (SQLBits 2016)
 
U-SQL Learning Resources (SQLBits 2016)
U-SQL Learning Resources (SQLBits 2016)U-SQL Learning Resources (SQLBits 2016)
U-SQL Learning Resources (SQLBits 2016)
 
U-SQL Federated Distributed Queries (SQLBits 2016)
U-SQL Federated Distributed Queries (SQLBits 2016)U-SQL Federated Distributed Queries (SQLBits 2016)
U-SQL Federated Distributed Queries (SQLBits 2016)
 
U-SQL Partitioned Data and Tables (SQLBits 2016)
U-SQL Partitioned Data and Tables (SQLBits 2016)U-SQL Partitioned Data and Tables (SQLBits 2016)
U-SQL Partitioned Data and Tables (SQLBits 2016)
 

Kürzlich hochgeladen

Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesZilliz
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024The Digital Insurer
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfRankYa
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 

Kürzlich hochgeladen (20)

Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector Databases
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 

FileTable and Semantic Search in SQL Server 2012

  • 1. FILETABLE AND SEMANTIC SEARCH IN SQL SERVER 2012 Michael Rys Principal Program Manager Microsoft Corp @SQLServerMike © 2012 Microsoft
  • 2. MY FAVORITE BEYOND RELATIONAL APPLICATION
      • Structured and unstructured search
      • Related/"semantic" search
  • 3. BEYOND RELATIONAL DATA
      Building and maintaining applications with relational and non-relational data is hard.
      • Pain points
          o Complex integration
          o Duplicated functionality
          o Compensation for unavailable services
      • Goals
          o Reduce the cost of managing all data
          o Simplify the development of applications over all data
          o Provide management and programming services for all data
  • 4. RICH UNSTRUCTURED DATA IN SQL SERVER 2012
      • 80% of all data is not stored in databases! Most of it is "unstructured"
      • Make SQL Server the preferred choice for managing unstructured data and allow building rich application experiences on top
      • Address important customer requests for capabilities and rich services for Rich Unstructured Data (RUDS)
          o Scale up storage and search to 100 million to 500 million documents
          o Easy use of and access to unstructured data from all applications
          o Rich insight into unstructured data to make better decisions
  • 6. RICH UNSTRUCTURED DATA & SERVICES ECOSYSTEM
      [Architecture diagram; recoverable labels, grouped by area:]
      • Access paths: transactional access (database applications, blobs); streaming Win32 access (Windows apps over the SMB share, SQL apps over FileStream); files/folders API (FileTable)
      • Rich services: full-text search, semantic similarity search
      • Scale-up: FileStreams in multiple containers (Disk1, Disk2, Disk3)
      • Remote BLOB Storage: customer application, SQL RBS API, provider libraries for Centera, SQL FILESTREAM and Azure
      • Integrated administration: backup/replication/AlwaysOn
  • 7. DEMO Integrated Management of documents in SQL Server 2012
  • 8. FILETABLE OVERVIEW
      • FileTable: a table of files/directories
          o User-created table with a fixed schema
          o Contains FILESTREAM and file attributes
          o Each row represents a file or a directory
          o System-defined constraints maintain the tree integrity
      • File/directory hierarchy view through a Windows share
          o Supports Win32 APIs for file/directory management
          o DB storage is transparent to Win32 applications
          o SMB level of application compatibility
          o Virtual network name (VNN) path support for transparent Win32 application failover
      [Diagram: FileTable folder hierarchy. FILESTREAM share (MSSQLSERVER) at \\my_machine\MSSQLSERVER; below it the database directories Private Docs (Database1) and Office Docs (Database2); below those the FileTable directories Media, Documents and LogFiles (each a FileTable); below those the user-defined directory structure.]
  • 9. CREATING A FILETABLE
      • Prerequisites
          o Enable FILESTREAM
          o Create FILESTREAM share and filegroup
          o Enable non-transactional access at the DB level
              ALTER DATABASE Contoso SET FILESTREAM( non_transacted_access=FULL, Directory_name = N'Contoso')
      • Create FileTable
              CREATE TABLE Contoso..Documents AS FILETABLE WITH (filetable_directory = N'Document Library')
      • Access at \\<machine name>\<FILESTREAM share>\Contoso\Document Library
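The prerequisites and the two statements on this slide can be put together roughly as follows. This is a minimal sketch, not the presenter's script: the filegroup name ContosoFS, the container name ContosoFS1 and the path C:\Data are hypothetical, and it assumes the Contoso database already exists and that FILESTREAM has been enabled for the instance in SQL Server Configuration Manager.

```sql
-- Sketch only: ContosoFS, ContosoFS1 and C:\Data are hypothetical names.

-- 1. Allow T-SQL and Win32 streaming access at the instance level.
EXEC sp_configure 'filestream access level', 2;
RECONFIGURE;
GO

-- 2. Add a FILESTREAM filegroup and one container to the database.
--    C:\Data must exist; the ContosoFS1 folder must not exist yet.
ALTER DATABASE Contoso ADD FILEGROUP ContosoFS CONTAINS FILESTREAM;
ALTER DATABASE Contoso
    ADD FILE (NAME = N'ContosoFS1', FILENAME = N'C:\Data\ContosoFS1')
    TO FILEGROUP ContosoFS;
GO

-- 3. Enable non-transactional access and name the database directory.
ALTER DATABASE Contoso
    SET FILESTREAM (NON_TRANSACTED_ACCESS = FULL, DIRECTORY_NAME = N'Contoso');
GO

-- 4. Create the FileTable; its rows appear as files and folders under
--    \\<machine name>\<FILESTREAM share>\Contoso\Document Library
USE Contoso;
GO
CREATE TABLE dbo.Documents AS FILETABLE
    WITH (FILETABLE_DIRECTORY = N'Document Library');
GO
```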
  • 10. MODIFYING A FILETABLE
      • FileTable has a fixed schema
          o Columns and system-defined constraints cannot be altered/dropped
          o Allows user-defined indexes/constraints/triggers
      • Disabling/enabling the FileTable namespace
              ALTER TABLE Documents DISABLE FILETABLE_NAMESPACE
          o Disables all system-defined constraints and Win32 access to the FileTable
          o Useful for bulk-loading/re-organization of data
      • A FileTable can be dropped like any other table
      • Catalog views can be used for obtaining metadata
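A bulk load could be bracketed as in the sketch below, which also reads FileTable metadata from the sys.filetables catalog view. It assumes the Documents FileTable from the previous slide lives in the dbo schema; the load step itself is left as a placeholder.

```sql
-- Sketch only: assumes dbo.Documents is an existing FileTable.

-- Turn off the system-defined constraints and Win32 access before a bulk load.
ALTER TABLE dbo.Documents DISABLE FILETABLE_NAMESPACE;
GO

-- ... bulk-load or re-organize rows here ...

-- Re-enable the namespace; the system-defined constraints apply again.
ALTER TABLE dbo.Documents ENABLE FILETABLE_NAMESPACE;
GO

-- Inspect FileTable metadata through a catalog view.
SELECT OBJECT_NAME(object_id) AS filetable_name,
       is_enabled,
       directory_name
FROM sys.filetables;
```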
  • 11. DATA ACCESS – FILE SYSTEM ACCESS
      • FileTable hierarchy is visible through the FILESTREAM share
              \\machine\<FILESTREAM share>\<Database_directory>\<FileTable_Directory>\...
      • Provides transparent Win32 API and file/directory management capabilities
          o e.g. MS Word can create/open/save files; xcopy can copy directory trees into the database
      • Win32 API operations are non-transactional
          o Operations cannot be part of any user transactions
      • Win32 operations are intercepted by SQL Server at the file system level
          o e.g. file/directory creation/deletion => insert/delete into the FileTable
          o Full locking/concurrency semantics with other accesses
          o Allows in-place update of file stream data/file attributes
      • Transactional FILESTREAM APIs can also be used
  • 12. DATA ACCESS – T-SQL ACCESS
      • Normal INSERT/UPDATE/DELETE allowed for FileTable manipulation
          o FileTable namespace integrity constraints are enforced
      • Set-based operations on the file attributes (value add)
      • Built-in functions
          o GetFileNamespacePath() – UNC path for a file/directory
          o FileTableRootPath() – UNC path to the FileTable root
          o GetPathLocator() – path_locator value for a file/directory
      • DDL/DML triggers are supported
          o DML triggers on a FileTable cannot update any FileTables
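The built-in functions listed on this slide might be used as in the following sketch. It assumes a FileTable named dbo.Documents; the file names hello.txt and report.docx are hypothetical, and GetPathLocator() is only meaningful for a file that already exists in the FileTable.

```sql
-- Sketch only: dbo.Documents is assumed to be a FileTable;
-- hello.txt and report.docx are hypothetical file names.

-- Create a file with a plain INSERT; path_locator defaults to the root directory.
INSERT INTO dbo.Documents (name, file_stream)
VALUES (N'hello.txt', CAST('Hello, FileTable' AS varbinary(max)));

-- Set-based query over file attributes, with the full UNC path of each file.
SELECT name,
       cached_file_size,
       last_write_time,
       file_stream.GetFileNamespacePath(1, 0) AS unc_path
FROM dbo.Documents
WHERE is_directory = 0
ORDER BY last_write_time DESC;

-- UNC path of the FileTable root directory.
SELECT FileTableRootPath(N'dbo.Documents') AS root_path;

-- path_locator of an existing file, looked up by its UNC path.
DECLARE @path nvarchar(4000) =
    FileTableRootPath(N'dbo.Documents') + N'\report.docx';
SELECT GetPathLocator(@path) AS path_locator;
```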
  • 13. MANAGING FILETABLE
      • DB backup/restore operations include FileTable data
          o A point-in-time restore may contain more recent FILESTREAM data due to non-transactional updates during backup
      • FileTables are secured like any other user tables
          o The same security is also enforced for Win32 access
      • Data loading
          o Windows tools like xcopy/robocopy or drag-and-drop operations through Windows Explorer can be used
          o BCP operations are supported for direct T-SQL data inserts
      • SSMS supports FileTable creation/exploration
  • 14. MANAGING FILETABLE – HIGH AVAILABILITY
      • SQL Server 2012 AlwaysOn is fully supported
      • Transparent data failover
          o FileTables can be configured with multiple secondary nodes
          o Both sync and async data replication are supported
          o File data and metadata are available on the secondary in case of failover
      • Transparent application failover
          o Virtual network name (VNN) path support for transparent Win32 application failover
          o Applications use a \\VNN\Share\db\... path
          o Applications are automatically redirected to the secondary in case of failover
      • Restrictions
          o FileTables cannot participate in "read-only" replicas
  • 15. FILETABLE RESTRICTIONS
      • FileTables cannot be partitioned
      • Merge/transactional replication is not supported
      • RCSI/snapshot isolation mode
          o Applications cannot modify file stream data in FileTables
      • Win32 application compatibility
          o Memory-mapped files, directory notifications and links are not supported
  • 16. UNSTRUCTURED DATA SCALE-UP: MULTIPLE CONTAINERS FOR FILESTREAM DATA
      • SQL Server 2008 R2
          o Only one storage container per FILESTREAM filegroup
          o Limits storage capacity scaling and I/O scaling
      • SQL Server 2012
          o Support for multiple storage containers per filegroup
          o DDL changes to CREATE/ALTER DATABASE statements
          o Ability to set max_size for the containers
          o DBCC SHRINKFILE EMPTYFILE support
      • Scaling flexibility
          o Storage scaling by adding additional storage drives
          o I/O scaling with multiple spindles
  • 17. UNSTRUCTURED DATA : MULTIPLE CONTAINERS Use of multiple spindles for achieving better I/O Scalability
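Adding a second container on another drive might look like the sketch below. The filegroup and container names (ContosoFS, ContosoFS1, ContosoFS2), the path D:\Data and the 100 GB cap are all hypothetical; the sketch assumes the database already has a FILESTREAM filegroup with one container.

```sql
-- Sketch only: ContosoFS, ContosoFS1, ContosoFS2 and D:\Data are hypothetical.

-- Add a second container on a different drive and cap its size.
ALTER DATABASE Contoso
    ADD FILE (NAME = N'ContosoFS2',
              FILENAME = N'D:\Data\ContosoFS2',
              MAXSIZE = 100GB)
    TO FILEGROUP ContosoFS;
GO

-- Ask SQL Server to move data out of the first container into the
-- remaining ones. For FILESTREAM containers the move is carried out by
-- the FILESTREAM garbage collector, so it may not complete immediately.
USE Contoso;
GO
DBCC SHRINKFILE (N'ContosoFS1', EMPTYFILE);
GO
```

Placing containers on separate physical drives is what provides the storage and I/O scaling described on the slides.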
  • 18. RUDS SCALE-UP: FILESTREAM PERF/SCALE
      • Improved performance of T-SQL and file I/O access
      • Various enhancements to improve read/write throughput
          o 5-fold increase in read throughput
          o Linear scaling with a large number of concurrent threads
      [Two charts, each labeled 2012]
  • 19. SUMMARY: FILETABLE
      • Application compatibility for Windows applications
          o Windows applications run on top of files stored in FileTables with no modifications
      • Relational value proposition
          o Provide integrated administration and services: backup, log shipping, HA-DR, full-text and semantic search, …
      • T-SQL orthogonality
          o File/folder attributes surfaced through relational columns
          o Power of set-based operations, policy management, reporting, etc.
          o File namespace hierarchy management
  • 20. FULL TEXT SEARCH IMPROVEMENTS IN SQL SERVER 2012
      • Improved performance and scale
          o Scale-up to 350M documents
          o iFTS query performance 7-10 times faster than in SQL Server 2008
          o Worst-case iFTS query response times < 3 sec for corpus
          o On par with or better than the main database search competitors
      • New functionality
          o Property search
          o Customizable NEAR
          o New word breakers: update existing word breakers, add Czech and Greek
      • Innovation in search
          o Semantic similarity search
  • 21. FULLTEXT SEARCH PERFORMANCE & SCALE IMPROVEMENTS
      • Architectural improvements
          o Improved internal implementation
          o Queries no longer block index updates
      • Improved query plans
          o Better plans for common queries
          o Full-text predicate folding
          o Parallel plan execution
      • Index and query tested at scale up to 350 million documents with <~2 sec response
      • ~3X better w/o DML and ~9X better with DML throughput
      • Scales easily with an increasing number of connections
  • 22. SCALE-UP: FULL-TEXT SEARCH 2005/8 vs 2012
      [Chart comparing 2005/8 and 2012] Queries over a 350M-document database with random DMLs running in the background. Beats SQL Server 2005 with a scale factor of more than 2x and with, on average, 60x better throughput.
  • 23. SCALE-UP: FULL-TEXT SEARCH 2005/8 vs 2012
      [Chart comparing 2005/8 and 2012] Query avgExecTime (ms) under various numbers of connections (50 ~ 2000 users) for a customer playback benchmark.
  • 24. FULLTEXT PROPERTY SCOPED SEARCH
      New search filter for document properties:
              CONTAINS (PROPERTY ( { column_name }, 'property_name' ), 'contains_search_condition' )
      • Setup once per database instance to load the Office filters
              exec sp_fulltext_service 'load_os_resources',1
              go
              exec sp_fulltext_service 'restart_all_fdhosts'
              go
      • Create a property list
              CREATE SEARCH PROPERTY LIST p1;
      • Add properties to be extracted
              ALTER SEARCH PROPERTY LIST [p1] ADD N'System.Author'
                  WITH (PROPERTY_SET_GUID = 'f29f85e0-4ff9-1068-ab91-08002b27b3d9',
                        PROPERTY_INT_ID = 4,
                        PROPERTY_DESCRIPTION = N'System.Author');
      • Create/alter the full-text index to specify the property list to be extracted
              ALTER FULLTEXT INDEX ON fttable... SET SEARCH PROPERTY LIST = [p1];
      • Query for properties
              SELECT * FROM fttable WHERE CONTAINS(PROPERTY(ftcol, 'System.Author'), 'fernlope');
  • 25. FULL-TEXT CUSTOMIZABLE NEAR
      OLD NEAR SYNTAX
          select * from fttable where contains(*, 'test near Space')
      NEW NEAR USAGES
        o SPECIFY DISTANCE
          select * from fttable where contains(*, 'near((test, Space), 5, false)')
        o REDUCE DISTANCE
          select * from fttable where contains(*, 'near((test, Space), 2, false)')
        o ORDER OF WORDS IS SPECIFIED AS IMPORTANT
          select * from fttable where contains(*, 'near((test, Space), 5, true)')
  • 26. STATISTICAL SEMANTIC SEARCH
      Semantic insight into textual content:
        o Uses language models to find the most important keywords in a document
        o No need to build brittle ontologies!
      Statistically prominent keywords:
        o Autogenerated tag clouds
      Potentially related content based on extracted keywords, such as:
        o Similar products (based on description)
        o Similar jobs or applicants
        o Similar support incidents (based on call logs)
        o Potential solutions (based on similar incidents)
      First-class usage experience:
        o Efficient linear algorithms
        o Integrated with FTS and SQL
        o New rowset functions for all results using SQL query
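The statistically prominent keywords described on this slide are exposed through the SEMANTICKEYPHRASETABLE rowset function. The following is a minimal sketch, not taken from the deck: it assumes a hypothetical table dbo.Documents with an integer key DocId and a Content column whose full-text index was created with STATISTICAL_SEMANTICS. The table, column, and variable names are illustrative only.

```sql
-- Assumed (hypothetical) schema, not from the slides:
--   dbo.Documents(DocId INT PRIMARY KEY, Title NVARCHAR(200), Content NVARCHAR(MAX))
--   with a full-text index on Content that has STATISTICAL_SEMANTICS enabled.

-- Top key phrases for a single document.
DECLARE @DocId INT = 1;  -- illustrative key value

SELECT TOP (10) kp.keyphrase, kp.score
FROM SEMANTICKEYPHRASETABLE(dbo.Documents, Content, @DocId) AS kp
ORDER BY kp.score DESC;

-- Key phrases across the whole table, aggregated as input for a tag cloud:
-- how many documents each phrase appears in, and its average score.
SELECT TOP (25)
       kp.keyphrase,
       COUNT(*)      AS document_count,
       AVG(kp.score) AS avg_score
FROM SEMANTICKEYPHRASETABLE(dbo.Documents, Content) AS kp
GROUP BY kp.keyphrase
ORDER BY document_count DESC, avg_score DESC;
```

Omitting the third argument returns key phrases for every indexed row, which is what makes the tag-cloud style aggregation possible with ordinary GROUP BY.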
  • 27. DEMO Semantic Extraction and Relationships FullText Search in SQL Server 2012
  • 28. SEMANTIC SIMILARITY
        o Input: Text such as varchar, Office, PDF, HTML, email…
        o Output: Rowset functions with standard SQL queries
      Illustrating example (Full-Text and Semantic Processing over the source table; sample terms: quarter, record, revenue…):

      Source Table
        Key   Title                Document
        D1    Annual Budget        …
        D2    Corporate Earnings   …
        D3    Marketing Reports    …
        …     …                    …

      Keyphrases
        ID    Keyword
        T1    revenue
        T2    growth
        T3    Windows
        T4    Azure
        …     …

      KeyphraseDocuments
        ID             DocID
        T1 (revenue)   D1 (Annual Budget)
        T2 (growth)    D2 (Corporate Earnings)
        T3 (Windows)   D3 (Marketing Reports)
        …              …
        T1 (revenue)   D7 (Finance Report)
        T3 (Windows)   D11 (Azure Strategy)
        T4 (Azure)     D11 (Azure Strategy)

      DocumentSimilarity
        DocID                    MatchedDocID
        D1 (Annual Budget)       D2 (Corporate Earnings)
        D1 (Annual Budget)       D7 (Finance Report)
        D3 (Marketing Reports)   D11 (Azure Strategy)
        …                        …

      Keyword Index (Full-Text)
        ID   Keyword   Colid   …   compDocid   CompOc               CompPid
        K1   revenue   1       …   10,23,123   (1,4),(5,8),(1,34)   2,5,6,8,4,3
        K2   growth    1       …   10,23,123   (1,5),(5,9),(1,34)   2,5,6,8,5,4
        …    …         …       …   …           …                    …
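The DocumentSimilarity and KeyphraseDocuments structures on this slide are queried through rowset functions rather than read directly. A hedged sketch using SEMANTICSIMILARITYTABLE and SEMANTICSIMILARITYDETAILSTABLE follows; dbo.Documents, DocId, Title, Content, and the key values are hypothetical names chosen for illustration, not objects from the deck.

```sql
-- Assumed (hypothetical) schema, not from the slides:
--   dbo.Documents(DocId INT PRIMARY KEY, Title NVARCHAR(200), Content NVARCHAR(MAX))
--   with a full-text index on Content that has STATISTICAL_SEMANTICS enabled.

DECLARE @SourceId  INT = 1;  -- illustrative: the document to find matches for
DECLARE @MatchedId INT = 2;  -- illustrative: one of the matched documents

-- Documents most similar to the source document.
SELECT TOP (10) d.DocId, d.Title, s.score
FROM SEMANTICSIMILARITYTABLE(dbo.Documents, Content, @SourceId) AS s
JOIN dbo.Documents AS d
  ON d.DocId = s.matched_document_key
ORDER BY s.score DESC;

-- Key phrases that explain why the two documents are considered similar.
SELECT TOP (5) sd.keyphrase, sd.score
FROM SEMANTICSIMILARITYDETAILSTABLE(
         dbo.Documents, Content, @SourceId,   -- source column and key
         Content, @MatchedId) AS sd           -- matched column and key
ORDER BY sd.score DESC;
```

Because both functions return ordinary rowsets, they can be joined, filtered, and combined with CONTAINS predicates like any other table source.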
  • 29. SEMANTIC EXTRACTION: END-2-END EXPERIENCE
        o Downloadable Language Statistical Database with registration stored procedure
        o Setup along with Full-Text
        o Metadata / Catalog views
        o System level DMVs for progress state and usage
        o Manageability through SSMS and SMO
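The end-to-end steps listed on this slide can be sketched roughly as follows. This assumes the semantic language statistics database has already been downloaded and attached under the name semanticsdb, and it reuses a hypothetical dbo.Documents table with primary key index PK_Documents and a catalog named DocumentsCatalog; all of these names are assumptions for illustration.

```sql
-- 1. Register the attached language statistics database (once per instance).
--    'semanticsdb' is the assumed name under which it was attached.
EXEC sp_fulltext_semantic_register_language_statistics_db @dbname = N'semanticsdb';

-- 2. Check the registration and the supported languages via catalog views.
SELECT * FROM sys.fulltext_semantic_language_statistics_database;
SELECT * FROM sys.fulltext_semantic_languages;

-- 3. Enable semantic indexing as part of the full-text index definition.
--    Table, index, and catalog names are hypothetical.
CREATE FULLTEXT CATALOG DocumentsCatalog;

CREATE FULLTEXT INDEX ON dbo.Documents
    (Content LANGUAGE 1033 STATISTICAL_SEMANTICS)
    KEY INDEX PK_Documents
    ON DocumentsCatalog
    WITH CHANGE_TRACKING AUTO;

-- 4. Monitor population progress through the DMVs.
SELECT * FROM sys.dm_fts_index_population;
SELECT * FROM sys.dm_fts_semantic_similarity_population;
```

Semantic indexing is declared per column inside the full-text index, which is why setup happens "along with Full-Text" rather than as a separate index type.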
  • 30. KEY TAKEAWAYS
      SQL Server's unstructured data support is a key strategy for enabling you to build complex data applications that go beyond relational data: content and collaboration, eDiscovery, healthcare, document management, etc.
  • 31. RELATED CONTENT
        o SQL Server 2012 whitepapers and information: http://www.sqlserverlaunch.com
        o Channel 9 DataBound Episode 2: http://channel9.msdn.com
        o MySemanticSearch demo: http://mysemanticsearch.codeplex.com
        o More demo data sets and demo scripts: http://blogs.msdn.com/b/sqlfts/archive/2011/07/21/introducing-fulltext-statistical-semantic-search-in-sql-server-codename-denali-release.aspx
        o Microsoft Virtual Academy recording: Coming Soon!

Editor's Notes

  1. Let's take a look at a beyond-relational (BR) application. What services does it provide? What about having these services supported in the database instead of each application building its own?
  2. Examples: managing an application that stores images in the file system and additional information in the database; building a spatial database application before SQL Server 2008. Example services: backup/restore, search over relational and non-relational data.
  3. SQL Server 2008 provides FileStream as a way to add large blobs/unstructured data streams to SQL Server while still being able to open a Win32 handle (using the SQL API) and get high streaming performance for the data. Win32 namespace support in SQL Server 2012 has the following goals:
     o Reduce the barrier to entry for customers who have data on file servers and Win32 applications that currently work on that data. With the Win32 namespace enabled, SQL Server generates a Windows share that can be exposed to existing Win32 applications like any file server share. This allows Win32 applications and mid-tier servers (such as IIS) to work with the data without having to understand database/transaction semantics.
     o Provide a single integrated set of admin tools: SQL backup/restore, replication, HA solutions, etc.
     o Scale up: add multiple disks on a machine for storing FileStream data.
     o Use SQL services such as full-text search over both FileStream data and relational metadata, and a property promotion infrastructure for extracting interesting properties from SQL blobs/FileStream to surface as relational columns for query.
  4. Optimized hot paths; removed unnecessary serialization, expensive file system operations, etc.