RFID Data Management
Kamlesh Laddhad (05329014)
Karthik B.(05329021)
Guide: Prof. Bernard Menezes
Outline
• Introduction to RFID Technology.
• Issues with RFID Technology.
• RFID Data Characteristics.
• Data Warehousing.
– Expressive Temporal Model: Dynamic Relationship ER Model
– RFID - Cuboids.
– Use of Bitmap Datatype.
• Data Cleaning.
– Extensible Sensor stream Processing (ESP)
– Statistical sMoothing for Unreliable RFid data (SMURF)
• Future Plans.
Introduction
• Radio Frequency Identification:
– It is an Automatic Identification and Data Capture Technology.
– Fast
– No contact or line of sight.
– Uses radio-frequency waves to transfer data
• Components
– Tag: small, low-cost device that can hold a limited amount of data.
• Associated with objects, such as pallets, cases, and even individual items.
– Reader: Recognizes the presence of a tag and reads the information stored on it.
• Unique electronic product code (EPC) associated with a tag.
• By placing RFID tag readers at various locations, one can track the
movement of objects through supply chain networks.
Applications and Adoptions
• Supply Chain Management: real-time inventory
tracking.
– US Department Of Defense: shipments to armed forces
• Retail: Active shelves monitor product availability
– Wal-Mart, Albertsons: major retail stores
• Access control: toll collection, transportation.
– Airline luggage management:
• British Airways: 20 million bags a year
• Implemented to reduce lost/misplaced luggage
• Anti-counterfeiting and security:
– Food and Drug Administration: To reduce counterfeit in
pharmaceutical supply chain
Perspectives for RFID Research
• The physics of building tags and readers
– Tags have few gates: beyond basic operation, they have very little computing power.
– Radio-frequency signals have problems operating in certain physical media.
• Privacy and safety issues:
– Complex encryption schemes are not possible on RFID tags.
– Counterfeiting by means of either illegitimate readers or spoofed tags is
possible.
– Reader-tag communication is wireless: third parties can eavesdrop on signals.
• Software architecture to collect, filter, organize, and answer online
queries:
– The number of tags is proportional to the number of items being serviced/tracked.
– The number of readers is proportional to the number of traceable strategic locations/areas.
• Each reader picks up tag signals on a continuous basis.
• Data generated by RFID systems is enormous:
• E.g. Wal-Mart is expected to generate 7 terabytes of RFID data per day.
• Our focus: the third stream (software architecture).
Data Warehousing Techniques
Data Management Challenges
• Data Explosion : Example
– A retailer with 3,000 stores, selling 10,000 items a
day per store.
– Each item moves 10 times on average before being
sold
• Movement recorded as (EPC, location, second)
– Data volume: 300 million tuples per day.
– Example OLAP Query: “Average time for items to
move from warehouse to checkout counter in March
2006?”.
• Costly to answer when March 2006 alone holds billions of tuples
(300 million tuples per day over 31 days).
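The data-volume figure above follows from simple multiplication. A minimal sketch of the arithmetic, using only the numbers on this slide (the variable names are illustrative):

```python
# Figures taken from the slide; names are illustrative only.
STORES = 3_000
ITEMS_PER_STORE_PER_DAY = 10_000
MOVES_PER_ITEM = 10          # each move yields one (EPC, location, second) tuple
DAYS_IN_MARCH = 31

tuples_per_day = STORES * ITEMS_PER_STORE_PER_DAY * MOVES_PER_ITEM
tuples_in_march = tuples_per_day * DAYS_IN_MARCH

print(f"{tuples_per_day:,} tuples per day")
print(f"{tuples_in_march:,} tuples in March")
```

An OLAP query over a single month therefore has to touch several billion raw tuples unless the data is compressed or summarized first.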
Data Characteristics
• Temporal and history oriented
– Applications dynamically generate observations (readings).
– Object locations and containment relationships among objects change over time.
– Need: Expressive data model.
• Inaccurate data and implicit semantics
– False positive: Non-existing tag incorrectly read.
– False Negative: Reader missed a tag which was in its vicinity.
– Noisy data & duplicate readings (redundancy): Same tag read more than
once.
– Need: Automated data filtering and transformation.
• Streaming and large volume
– Objects stay in one place for long durations, and readers record them
periodically, so large volumes of data keep being generated.
– We need to preserve this data for tracking and monitoring.
– Need: Scalable storage scheme, compression techniques to reduce data.
• Data Granularity
– Data collection granularity needs to be decided
– Differs across applications.
Warehousing Helps!!
• Lossless compression
– Remove redundancy: (r1,l1,t1) (r1,l1,t2) ... (r1,l1,t10) => (r1,l1,t1,t10)
– Group objects that move and stay together.
• Data cleaning: Multi-reading, missed-reading, error-reading, bulky movement.
• Data mining: Find trends, outliers, frequent, sequential, flow patterns.
• Multi-dimensional summary: product, location, time, …
– Store manager: Check item movements from the backroom to different shelves
in his store
– Region manager: Collapse intra-store movements and look at distribution
centers, warehouses, and stores
• Query Processing
– Support for OLAP: roll-up, drill-down, slice, and dice
– Path query: New to RFID-Warehouses, about the structure of paths
• What products that go through quality control have shorter paths?
• What locations are common to the paths of a set of defective auto-parts?
• Identify containers at a port that have deviated from their historic paths
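The redundancy-removal step above, (r1,l1,t1) ... (r1,l1,t10) => (r1,l1,t1,t10), can be sketched as follows. This is an illustration under simple assumptions (readings are (tag, location, time) tuples with comparable times), not the warehouse's actual loader; `compress_readings` is a hypothetical name.

```python
from itertools import groupby

def compress_readings(readings):
    """Collapse each run of consecutive readings of one tag at one location
    into a single (tag, location, time_in, time_out) stay record."""
    # Sort by tag, then by time, so each tag's readings are in temporal order.
    ordered = sorted(readings, key=lambda r: (r[0], r[2]))
    stays = []
    # groupby only merges adjacent rows, so a tag that leaves a location and
    # later returns produces two separate stay records.
    for (tag, location), run in groupby(ordered, key=lambda r: (r[0], r[1])):
        times = [t for _, _, t in run]
        stays.append((tag, location, times[0], times[-1]))
    return stays

raw = [("r1", "l1", t) for t in range(1, 11)] + [("r1", "l2", 11), ("r1", "l2", 12)]
stays = compress_readings(raw)
print(stays)
```

Twelve raw readings shrink to two stay records, and no location or interval information is lost.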
Dynamic Relationship ER Model
• Proposed by Wang and Liu from Siemens.
• RFID entities are static and are not altered.
• RFID relationships: dynamic and change all the
time.
• Two types of dynamic relationships added:
– Event-based dynamic relationship. A timestamp
attribute added to represent the occurring timestamp
of the event.
– State-based dynamic relationship. tstart and tend
attributes added to represent the lifespan of a state.
• Static entity tables
– SENSOR (sensor_epc, name, description)
– OBJECT (object_epc, name, description)
– LOCATION (location_id, name, owner)
– TRANSACTION (transaction_id, transaction_type)
• Dynamic relationship tables
– OBSERVATION (sensor_epc, value, timestamp)
– CONTAINMENT (epc, parent_epc, tstart, tend)
– OBJECTLOCATION (epc, location_id, tstart, tend)
– SENSORLOCATION (sensor_epc, location_id, position, tstart, tend)
– TRANSACTIONITEM (transaction_id, epc, timestamp)
Monitoring.
• Missing RFID Object Detection:
– Find when and where the object with EPC = 'MEPC'
was lost.
• select location_id, tstart, tend from objectlocation
where epc='MEPC' and tstart = ( select max(o.tstart) from
objectlocation o where o.epc='MEPC' )
– Check whether any objects are missing at the current location C,
knowing that all objects were present at the previous
location L at time T.
• select l.epc from objectlocation l where l.location_id =
'L' and l.tstart <= 'T' and l.tend >= 'T' and l.epc not
in ( select c.epc from objectlocation c where
c.location_id = 'C' )
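The two monitoring queries can be tried out on a toy OBJECTLOCATION table. The sketch below uses Python's built-in sqlite3 module; the sample rows, the integer timestamps and the EPC names E1..E3 are assumptions made purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE objectlocation (epc TEXT, location_id TEXT, tstart INTEGER, tend INTEGER)"
)
conn.executemany(
    "INSERT INTO objectlocation VALUES (?, ?, ?, ?)",
    [
        ("E1", "L", 10, 20), ("E2", "L", 10, 20), ("E3", "L", 10, 20),
        ("E1", "C", 25, 40), ("E2", "C", 25, 40),  # E3 never shows up at C
    ],
)

# Where and when was E3 last seen? (the row with the latest tstart)
last_seen = conn.execute(
    """SELECT location_id, tstart, tend FROM objectlocation
       WHERE epc = ? AND tstart = (SELECT MAX(o.tstart) FROM objectlocation o
                                   WHERE o.epc = ?)""",
    ("E3", "E3"),
).fetchall()

# Which objects were at L at time T = 15 but never appear at C?
missing = conn.execute(
    """SELECT l.epc FROM objectlocation l
       WHERE l.location_id = ? AND l.tstart <= ? AND l.tend >= ?
         AND l.epc NOT IN (SELECT c.epc FROM objectlocation c
                           WHERE c.location_id = ?)""",
    ("L", 15, 15, "C"),
).fetchall()

print(last_seen, missing)
```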
Tracking
• RFID Object Moving Time Inquiry:
– How long does it take to move object 'OEPC' from location S to
location E?
• select (e.tstart-s.tstart) as supplying_time from
objectlocation e, objectlocation s where e.epc =
'OEPC' and s.epc='OEPC' and s.location_id ='S' and
e.location_id='E'
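A small run of the moving-time query, again with sqlite3. The rows and integer timestamps are made up for illustration; the query itself is the self-join shown above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE objectlocation (epc TEXT, location_id TEXT, tstart INTEGER, tend INTEGER)"
)
conn.executemany(
    "INSERT INTO objectlocation VALUES (?, ?, ?, ?)",
    [("OEPC", "S", 100, 130), ("OEPC", "T", 140, 170), ("OEPC", "E", 180, 200)],
)

# Self-join: one alias for the start location, one for the end location.
supplying_time = conn.execute(
    """SELECT e.tstart - s.tstart AS supplying_time
       FROM objectlocation e, objectlocation s
       WHERE e.epc = ? AND s.epc = ?
         AND s.location_id = ? AND e.location_id = ?""",
    ("OEPC", "OEPC", "S", "E"),
).fetchone()[0]

print(supplying_time)
```

Note that the query measures arrival-to-arrival time (tstart at E minus tstart at S); if an object visits S or E more than once, the join returns one row per pair of visits.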
Compression Idea
• Bulky object movements
– Objects often move and stay together through the supply chain.
– If 1000 packs of product P stay together at the distribution center,
register a single record.
– (GID, distribution center, time_in, time_out).
– GID is a generalized identifier that represents the 1000 packs that stayed
together at the distribution center
• Analysis usually takes place at a much higher level of abstraction
than the one present in raw RFID data
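[Figure: bulk movement through the supply chain. Factory → Dist. Center 1, Dist. Center 2, … in groups of 10 pallets (1000 cases) → store 1, store 2, … in groups of 20 cases (1000 packs) → shelf 1, shelf 2, … in groups of 10 packs (12 sodas)]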
RFID Cuboids
• Fact Table: (EPC, location, time_in, time_out).
• In supply chain: Items travel through a series of locations.
• Query: what is the average time that product P stays at store in
Location A?
• Traditional cubes miss the path structure of the data
• Stay Table: (GIDs, location, time_in, time_out: measures):
– Records information on items that stay together at a given location
– If using record transitions: difficult to answer queries, lots of
intersections needed
• Map Table: (GID, <GID1,..,GIDn>)
– Links together stages that belong to the same path. Provides additional
compression and query processing efficiency
– High level GID points to lower level GIDs
– If saving complete EPC Lists: high costs of IO to retrieve long lists,
costly query processing
• Information Table: (EPC list, attribute 1,...,attribute n)
– Records path-independent attributes of the items, e.g., color,
manufacturer, price..
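One way to picture the Stay and Map tables is to group items by shared path prefix. The sketch below is an illustration only: the GID naming scheme (g0, g1, …), the item_count measure and the `build_cuboid` function are assumptions, not the construction used in the RFID-Cuboid paper.

```python
from collections import defaultdict

def build_cuboid(paths):
    """paths: {epc: [(location, time_in, time_out), ...]} with non-empty paths.
    Returns (stay, map_table):
      stay[gid]      = [location, time_in, time_out, item_count]
      map_table[gid] = child GIDs, or the EPCs themselves at the last stage."""
    stay = {}
    map_table = defaultdict(list)
    gid_of_prefix = {}  # path prefix (tuple of stages) -> GID
    for epc, stages in paths.items():
        parent = None
        for depth in range(1, len(stages) + 1):
            prefix = tuple(stages[:depth])
            if prefix not in gid_of_prefix:
                # First item seen with this path prefix: open a new stay record.
                gid = f"g{len(gid_of_prefix)}"
                gid_of_prefix[prefix] = gid
                location, time_in, time_out = stages[depth - 1]
                stay[gid] = [location, time_in, time_out, 0]
                if parent is not None:
                    map_table[parent].append(gid)
            gid = gid_of_prefix[prefix]
            stay[gid][3] += 1  # one more item shares this stage
            parent = gid
        map_table[parent].append(epc)  # the last stage points at the EPC
    return stay, dict(map_table)

paths = {
    "e1": [("factory", 1, 5), ("dc", 6, 10), ("store1", 11, 20)],
    "e2": [("factory", 1, 5), ("dc", 6, 10), ("store1", 11, 20)],
    "e3": [("factory", 1, 5), ("dc", 6, 10), ("store2", 12, 20)],
}
stay, map_table = build_cuboid(paths)
print(stay)
print(map_table)
```

Nine item-level stage records collapse into four stay records, and the map table lets a high-level GID be expanded to lower-level GIDs only when a query needs it.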
EPC Overview
• Electronic product code
– Standard naming scheme, proposed by Auto-Id Center.
– An EPC uniquely identifies an item.
– Format: <Header, Manager_No., Object Class, Serial No.>
• Header: Identifies the length, type, structure, version and generation
of EPC.
• Manager Number: Identifies an organizational entity.
• Object Class: Identifies a “class”, or type of thing.
• Serial Number: Specific instance of the Object Class being tagged.
– We will refer to
• <Header, Manager No, Object Class>: Prefix
• <Serial No.>: Suffix
Use of Bitmap Datatype
• Observation: Items move together.
– Groups of items in the same proximity - e.g. on a shelf, on a
shipment
– Groups of items with same property - e.g. Same product
• Use a bitmap type for modeling a collection of EPCs
that can occur in item tracking applications.
– Instead of storing a tuple per item store a tuple for all the
items having same prefix.
– New extra fields instead of epc:
• <Len, Suff_len, Prefix, Suff_start, Suff_end, bitmap>
Example: Product Inventory
• With EPC collections

Store_id | Prod_id | Time | Item_collection
s1       | p1      | t1   | epc11, epc12, epc13, …
s1       | p2      | t2   | epc21, epc22, epc23, …
…        | …       | …    | …

• With epc_bitmaps

Store_id | Prod_id | Time | Item_bmap
s1       | p1      | t1   | bmap1
s1       | p2      | t2   | bmap2
…        | …       | …    | …
Use of Bitmap Datatype (contd.)
• 64-bit EPC layout:

Header | EPC_Manager | Object_Class | Serial_Number
2 bits | 21 bits     | 17 bits      | 24 bits

• EPCs in the collection: 0x4AA890001F62C160, …, 0x4AA890001FA0B38E
• Resulting epc_bitmap tuple:

Len | Suff_len | Prefix       | Suff_start | Suff_end | bitmap
64  | 24       | 0x4AA890001F | 0x62C160   | 0xA0B38E | 101001…00010
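The field widths above (2 + 21 + 17 + 24 = 64 bits) are enough to split an EPC into its parts, and into the 40-bit prefix and 24-bit suffix used by the bitmap tuple. A minimal sketch, with function names chosen for illustration:

```python
# Field widths of the 64-bit layout shown on the slide, most significant first.
FIELD_WIDTHS = (
    ("header", 2),
    ("epc_manager", 21),
    ("object_class", 17),
    ("serial_number", 24),
)

def split_epc(epc):
    """Return the four EPC fields as a dict of integers."""
    fields = {}
    shift = 64
    for name, width in FIELD_WIDTHS:
        shift -= width
        fields[name] = (epc >> shift) & ((1 << width) - 1)
    return fields

def prefix_suffix(epc, suffix_bits=24):
    """Prefix = header + manager + object class; suffix = serial number."""
    return epc >> suffix_bits, epc & ((1 << suffix_bits) - 1)

fields = split_epc(0x4AA890001F62C160)
prefix, suffix = prefix_suffix(0x4AA890001F62C160)
print(fields)
print(hex(prefix), hex(suffix))
```

Both example EPCs on the slide share the prefix 0x4AA890001F, which is why a single tuple with one bitmap can represent the whole collection.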
Bitmap Operations
• To use this datatype in SQL, we need
operations on such bitmaps.
• Conversion and counting operations: epc2Bmap,
bmap2Epc and bmap2Count
• Pairwise Logical Operations: bmapAnd, bmapOr,
bmapMinus, and bmapXor
• Maintenance Operations: bmapInsert and bmapDelete
• Membership Testing Operation: bmapExists
• Comparison Operation: bmapEqual
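The operations listed above can be mimicked with Python integers as bit vectors. This is a simplified sketch, not the Oracle implementation: a bitmap is a (prefix, bits) pair where bit k stands for suffix k, the Suff_start/Suff_end trimming is omitted, and the snake_case names are stand-ins for epc2Bmap, bmapAnd and the rest.

```python
SUFFIX_BITS = 24  # serial-number width of the 64-bit EPC layout
SUFFIX_MASK = (1 << SUFFIX_BITS) - 1

def _same_prefix(a, b):
    if a[0] != b[0]:
        raise ValueError("bitmaps must share one EPC prefix")
    return a[0]

def epc2bmap(epcs):
    """Build (prefix, bits) from EPCs that all share one prefix."""
    prefix, bits = None, 0
    for epc in epcs:
        if prefix is None:
            prefix = epc >> SUFFIX_BITS
        elif epc >> SUFFIX_BITS != prefix:
            raise ValueError("all EPCs must share one prefix")
        bits |= 1 << (epc & SUFFIX_MASK)
    return prefix, bits

def bmap2epc(bmap):
    """List the EPCs in the bitmap in ascending suffix order."""
    prefix, bits = bmap
    epcs = []
    while bits:
        low = bits & -bits  # isolate the lowest set bit
        epcs.append((prefix << SUFFIX_BITS) | (low.bit_length() - 1))
        bits ^= low
    return epcs

def bmap2count(bmap):
    return bin(bmap[1]).count("1")

def bmap_and(a, b):
    return _same_prefix(a, b), a[1] & b[1]

def bmap_or(a, b):
    return _same_prefix(a, b), a[1] | b[1]

def bmap_minus(a, b):
    return _same_prefix(a, b), a[1] & ~b[1]

def bmap_xor(a, b):
    return _same_prefix(a, b), a[1] ^ b[1]

def bmap_exists(bmap, epc):
    return epc >> SUFFIX_BITS == bmap[0] and bool((bmap[1] >> (epc & SUFFIX_MASK)) & 1)

def bmap_insert(bmap, epc):
    return bmap_or(bmap, epc2bmap([epc]))

def bmap_delete(bmap, epc):
    return bmap_minus(bmap, epc2bmap([epc]))

def bmap_equal(a, b):
    return a == b

# Items added to a shelf between two snapshots: bmap2Epc(bmapMinus(later, earlier)).
base = 0x4AA890001F << SUFFIX_BITS
shelf_t1 = epc2bmap([base | 1, base | 2])
shelf_t2 = epc2bmap([base | 1, base | 2, base | 5])
added = bmap2epc(bmap_minus(shelf_t2, shelf_t1))
print([hex(e) for e in added])
```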
Use of these operations in SQL
• Items added to a given shelf between time t1 and t2.
– SELECT bmap2Epc(bmapMinus(s2.item_bmap,
s1.item_bmap)) FROM Shelf_Inventory s1, Shelf_Inventory
s2 WHERE s1.shelf_id = <sid1> AND s1.shelf_id =
s2.shelf_id AND s1.time = <t1> AND s2.time = <t2>;
• Book store categorizes books in various categories.
– The following query determines the shelves that currently hold books
with both the 'Adventure' and the 'Romance' property.
– SELECT s.shelf_id FROM Shelf_Inventory s WHERE
bmap2Count(bmapAnd(s.item_bmap, (SELECT
bmapAnd(p.Adventure, p.Romance) FROM
Property_Inventory p))) > 0 AND s.time = <current_date>;
Road Ahead
• Extension to bitmap proposal:
– Bitmap datatype is more appropriate for initial bulk-load & batch updates.
– It performs badly for incremental updates.
– A ‘hybrid Scheme’ for incremental Updates:
• Maintain inventories at periodic checkpoints using bitmaps.
• For changes occurring between checkpoints, maintain a traditional item-level
table.
• Answer queries by merging the latest checkpoint bitmap with the
corresponding duration’s item-level data.
• The EPC suffixes in a collection may not be contiguous
– The bitmap will be sparse: lots of zeros.
– Compress it using some encoding scheme
• Good for initial bulk loading and batch updates
• May reduce efficiency of bitmap operations.
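The hybrid scheme can be sketched as a checkpoint bitmap plus a log of item-level changes. Everything here is a hypothetical illustration of the idea on this slide: the bitmap is a plain integer (bit k = suffix k), and each delta is a (time, suffix, change) tuple with change +1 for an arrival and -1 for a removal.

```python
def inventory_at(checkpoint_bits, deltas, t):
    """Merge the latest checkpoint bitmap with item-level changes up to time t."""
    bits = checkpoint_bits
    for time, suffix, change in sorted(deltas):
        if time > t:
            break  # later changes belong to the future of the query time
        if change > 0:
            bits |= 1 << suffix      # item arrived
        else:
            bits &= ~(1 << suffix)   # item left
    return bits

checkpoint = 0b0110  # suffixes 1 and 2 present at the checkpoint
deltas = [(5, 3, +1), (7, 1, -1), (9, 0, +1)]
print(bin(inventory_at(checkpoint, deltas, 8)))
```

Incremental updates only append to the small delta table; the bitmap is rewritten once per checkpoint.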
Open Problems
• Efficient methods for data mining problems
– Trend analysis
– Outlier detection
– Path clustering
• We will try exploring data mining applications to
RFID data.
RFID Data Cleaning
Issues in Data Cleaning
• Lack of Completeness
– RFID readers capture only 60-70% of all tags that are in the
vicinity
– Smoothing of data is done to rectify the loss of intermediate
messages
• Temporal nature of data (tag dynamics)
– RFID tags are in motion, which makes them more
difficult to handle
– Motion of a tag causes readings to be dropped
• RFID data streams are fast and high in
volume
– Hence filtering is important before sending them to the database
Current Strategies
• Temporal Granule:
– Based on the fact that tag data do not change much
over a small time period
– Readings can be grouped over a small time window
• Spatial Granule:
– Similarly, data from physically close readers are also
homogeneous
Stages of ESP
• Point: operates over a single value in a sensor
stream, filtered by a predicate in the WHERE
clause
• Smooth: granularity defined by applications to
correct for missed readings temporally (over one
input only); uses aggregate function over the
input.
• Merge: granularity specified by the application
to correct for missed readings spatially; grouped
by the specified spatial granule.
Stages of ESP (contd.)
• Arbitrate: deals with conflicts between different spatial
granules; grouped by spatial granule first, then uses the HAVING
construct to determine those conflicts
• Virtualize: used for combining data streams from different
sources, possibly from different devices; a join construct is used
to combine the different data streams, which are then filtered
using some predicate
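The first four stages can be pictured as a chain of small functions over (time, reader, tag) readings. This is a loose Python illustration of the stage descriptions, not ESP's declarative queries: the counting aggregate, the fixed-width time slots, the reader-to-granule mapping and the "most reads wins" arbitration rule are all assumptions.

```python
from collections import Counter

def point(readings, predicate):
    """Point: filter single readings with a predicate (the WHERE clause)."""
    return [r for r in readings if predicate(r)]

def smooth(readings, window):
    """Smooth: aggregate (here, count) reads per reader and tag in each time slot."""
    counts = Counter()
    for t, reader, tag in readings:
        counts[(t // window, reader, tag)] += 1
    return counts

def merge(counts, granule_of):
    """Merge: combine readers that belong to the same spatial granule."""
    merged = Counter()
    for (slot, reader, tag), n in counts.items():
        merged[(slot, granule_of[reader], tag)] += n
    return merged

def arbitrate(merged):
    """Arbitrate: when several granules report a tag, keep the one with most reads."""
    best = {}
    for (slot, granule, tag), n in merged.items():
        key = (slot, tag)
        if key not in best or n > best[key][1]:
            best[key] = (granule, n)
    return {key: granule for key, (granule, _) in best.items()}

granule_of = {"r1": "shelf0", "r2": "shelf0", "r3": "shelf1"}
readings = [(0, "r1", "A"), (1, "r2", "A"), (1, "r1", ""), (2, "r3", "A"), (3, "r1", "A")]

cleaned = point(readings, lambda r: r[2] != "")  # drop readings with an empty tag ID
located = arbitrate(merge(smooth(cleaned, window=5), granule_of))
print(located)
```

Tag A is heard three times by shelf0's readers and once by shelf1's reader within the slot, so it is attributed to shelf0.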
Smooth stage
• False Positives: (erroneous readings) reporting objects
that are not actually present
• False Negatives: (missed readings) not reporting objects
that actually are present
False positives and False Negatives [Jeff06]
Tag List
• The reader has an internal table called the Tag List.
• An epoch is the smallest unit of interaction between the reader
and the middleware.
• Every epoch consists of a certain number of interrogation cycles.
• An interrogation cycle is one run of the reader protocol to
determine all tags.
• At every epoch the reader sends the tag list to the middleware.

Tag ID   | Responses | Timestamp
12341234 | 6         | t1
12347890 | 1         | t2
SMURF – Per tag Cleaning
• SMURF uses statistical methods to reduce the false
negatives and false positives occurring in the RFID
stream.
• The goal is twofold: first, to determine the
smoothing window size; second, to ensure that
transitions of the tags are detected.
• To determine the window size, we fit a
probability distribution to the sample size.
• To detect the transition of a tag out of the
reader's vicinity, we define a 98% confidence interval
within that probability distribution function on the
sample size $|S_i|$.
SMURF – Per tag Cleaning (contd.)
• Using the tag list, the per-epoch sampling
probability $p_{i,t}$ is determined:
$p_{i,t}$ = (number of times tag $i$ was read in epoch $t$) /
(interrogation cycles per epoch)
• We average this over the sample $S_i$ to get
the average read rate $p_i^{avg}$ for tag $i$.
• If the same probability $p_i^{avg}$ is assumed for each
epoch throughout the window, then each
epoch's observation is like a Bernoulli trial.
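The per-epoch sampling probability and the average read rate can be computed directly from the Responses column of the tag list. A minimal sketch; the figure of 10 interrogation cycles per epoch and the function names are assumptions for illustration.

```python
def per_epoch_probability(responses, cycles_per_epoch):
    """p_{i,t}: fraction of interrogation cycles in one epoch that read the tag."""
    return responses / cycles_per_epoch

def average_read_rate(responses_per_epoch, cycles_per_epoch):
    """p_i^avg: mean of p_{i,t} over the epochs in which the tag was observed."""
    probs = [
        per_epoch_probability(r, cycles_per_epoch)
        for r in responses_per_epoch
        if r > 0  # epochs with no response are not part of the sample S_i
    ]
    return sum(probs) / len(probs) if probs else 0.0

CYCLES_PER_EPOCH = 10  # assumed value, not given on the slides

# Tag 12341234 answered 6 times in its epoch (see the tag list table).
p_single = per_epoch_probability(6, CYCLES_PER_EPOCH)
# A window of four epochs in which the tag was read 6, 0, 3 and 9 times.
p_avg = average_read_rate([6, 0, 3, 9], CYCLES_PER_EPOCH)
print(p_single, round(p_avg, 3))
```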
SMURF – Per tag Cleaning (contd.)
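• So $|S_i|$ is a binomial random variable for a sample $S_i$,
with mean $\mu = w_i\,p_i^{avg}$ and variance $\sigma^2 = w_i\,p_i^{avg}(1 - p_i^{avg})$.
• Using this, we can express the window size as a lower bound: in the SMURF
paper, $w_i \ge \lceil (1/p_i^{avg}) \ln(1/\delta) \rceil$ ensures that tag $i$ is read
within the window with probability greater than $1 - \delta$.
• If the current window size is less than the calculated one
then the window size is adjusted accordingly.
• Similarly, using the Central Limit Theorem for transition
detection, we get $\bigl||S_i| - \mu\bigr| > 2\sigma$.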
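Both per-tag tests can be written in a few lines. The transition test is the 2σ rule stated above; the window-size bound ⌈ln(1/δ) / p_avg⌉ is the completeness bound as I recall it from the SMURF paper, so treat it (and the parameter name delta) as an assumption rather than something stated on this slide.

```python
import math

def required_window(p_avg, delta=0.05):
    """Smallest window (in epochs) so that the tag is missed with probability
    at most delta: ceil(ln(1/delta) / p_avg). Assumed form of the bound."""
    return math.ceil(math.log(1 / delta) / p_avg)

def transition_detected(num_readings, window, p_avg):
    """Flag a transition when |S_i| is more than 2 sigma away from its
    binomial mean w * p_avg."""
    mean = window * p_avg
    sigma = math.sqrt(window * p_avg * (1 - p_avg))
    return abs(num_readings - mean) > 2 * sigma

print(required_window(0.6))             # a well-read tag needs a short window
print(required_window(0.1))             # a poorly read tag needs a long one
print(transition_detected(1, 10, 0.6))  # far fewer reads than expected
print(transition_detected(5, 10, 0.6))  # within normal variation
```

The two tests pull in opposite directions: completeness asks for a larger window, while a detected transition calls for shrinking it.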
Normal Sliding window….
• Epoch based mid-point sliding window
• Emits a reading with an epoch value corresponding to
the middle of the window
Ensuring Completeness
• In the first window, $p_i^{avg}$ demands a larger window
• Thus window size is increased
Transition Detection
• In the first window the number of readings decreases
significantly (and statistically)
• Thus a transition is likely to have occurred; so window
is halved
[Franklin06]
SMURF – Multi-tag aggregate Cleaning
• Similar to per-tag cleaning, the window for multi-tag cleaning is
determined from $p^{avg}$, the average per-epoch sampling probability
over all observed tags.
• To detect a transition in the population count, we estimate the
population count over two windows, $[t - w, t]$ and $[t - w/2, t]$, with
true populations $N_w$ and $N_{w'}$.
• For a transition to have happened, the difference
between the two estimates must exceed the limit
$2(\sigma_w + \sigma_{w'})$.
SMURF – Multi-tag aggregate Cleaning (contd.)
• To estimate the population count, we use
π-estimators. The estimated population count is given
by $\hat{N}_w = \sum_{i \in S_w} 1/\pi_i$.
• Similarly by π-estimators, and assuming independence
across different tags, the variance of the estimate is
estimated as $\hat{\sigma}_w^2 = \sum_{i \in S_w} (1 - \pi_i)/\pi_i^2$.
• Here $\pi_i$ is the probability of reading tag $i$ at least once
during the whole window, given by $\pi_i = 1 - (1 - p_i^{avg})^w$.
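A π-estimator weights each observed tag by the inverse of its probability of being seen at all. The sketch below uses the standard Horvitz-Thompson form with independent tags, which is my reading of the estimator described here; the function names and sample read rates are illustrative.

```python
import math

def pi_estimate(avg_read_rates, window):
    """avg_read_rates: p_i^avg (each > 0) for every tag observed in the window.
    Returns (estimated population count, estimated variance)."""
    # pi_i = probability of reading tag i at least once in `window` epochs.
    pis = [1 - (1 - p) ** window for p in avg_read_rates]
    n_hat = sum(1 / pi for pi in pis)
    variance = sum((1 - pi) / pi ** 2 for pi in pis)
    return n_hat, variance

def population_changed(estimate_w, estimate_half):
    """Compare the full-window and half-window estimates, each given as
    (count, variance), against the 2 * (sigma_w + sigma_w') limit."""
    (n_w, var_w), (n_half, var_half) = estimate_w, estimate_half
    return abs(n_w - n_half) > 2 * (math.sqrt(var_w) + math.sqrt(var_half))

# Two tags, each read with probability 0.5 per epoch, over a 1-epoch window:
# each observed tag stands for two tags in the population.
n_hat, variance = pi_estimate([0.5, 0.5], window=1)
print(n_hat, variance)
```

With longer windows each π_i approaches 1, so the estimate converges to the plain count of observed tags and its variance shrinks toward zero.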
The Road ahead…
• RFID applications do not tolerate delays in
data delivery
• Data is either in the cache or in the database; data
in the database increases processing time, and data in the
cache cannot be queried with SQL-like queries
• Anomaly detection is also an
important part of object tracking
• Issues like untraceability, forward security, and database
desynchronization are still not completely resolved.
• Counterfeiting is another serious problem with RFID
• In the next stage we expect to look into some of these
issues
????
Thank You.
References
• Hector Gonzalez, Jiawei Han, Xiaolei Li and
Diego Klabjan. Warehousing and analyzing
massive RFID data sets. ICDE, 2006.
• Fusheng Wang and Peiya Liu. Temporal
management of RFID data. VLDB, 2005.
• Ying Hu, Seema Sundara, Timothy Chorma and
Jagannathan Srinivasan. Supporting RFID-based
item tracking applications in Oracle DBMS using
a bitmap datatype. VLDB, 2005.
References (contd.)
• Shawn R. Jeffery, Minos Garofalakis and Michael J.
Franklin. Adaptive cleaning for RFID data streams.
VLDB, 2006.
• Shawn R. Jeffery, Gustavo Alonso, Michael J. Franklin,
Wei Hong and Jennifer Widom. Declarative support for
sensor data cleaning. In Pervasive, 2006.
• Sudarshan S. Chawathe, Venkat Krishnamurthy,
Sridhar Ramachandran and Sanjay E. Sarma. Managing RFID
data. VLDB, 2004.
ppt
ppt
ppt
ppt
ppt
ppt
ppt
ppt
ppt
ppt
ppt
ppt
ppt
ppt
ppt
ppt
ppt
ppt
ppt
ppt
ppt
ppt
ppt
ppt
ppt
ppt
ppt
ppt
ppt
ppt

Weitere ähnliche Inhalte

Ähnlich wie ppt

Presentation 1 rfid introduction
Presentation 1 rfid introductionPresentation 1 rfid introduction
Presentation 1 rfid introductionMouhanad Alkhaldi
 
RFID Based Asset Management case stories
RFID Based Asset Management case storiesRFID Based Asset Management case stories
RFID Based Asset Management case storiesLeon Smiers
 
Automated Storage/Retrieval System and Automatic Identification and Data Capt...
Automated Storage/Retrieval System and Automatic Identification and Data Capt...Automated Storage/Retrieval System and Automatic Identification and Data Capt...
Automated Storage/Retrieval System and Automatic Identification and Data Capt...vishaldattKohir1
 
Temporal Pattern Mining
Temporal Pattern MiningTemporal Pattern Mining
Temporal Pattern MiningPrakhar Dhama
 
03 internet-of-things-rfid-systems-and-applications
03 internet-of-things-rfid-systems-and-applications03 internet-of-things-rfid-systems-and-applications
03 internet-of-things-rfid-systems-and-applicationsJohn Soldatos
 
GrandesMentes_Library AutomationSolution.pptx
GrandesMentes_Library AutomationSolution.pptxGrandesMentes_Library AutomationSolution.pptx
GrandesMentes_Library AutomationSolution.pptxNishant Dean
 
15215237 pss7-ans
15215237 pss7-ans15215237 pss7-ans
15215237 pss7-anstigertsang
 
Barcode & RFiD in Supply Chain
Barcode & RFiD in Supply ChainBarcode & RFiD in Supply Chain
Barcode & RFiD in Supply ChainExistco Pty Ltd
 
Building your big data solution
Building your big data solution Building your big data solution
Building your big data solution WSO2
 
Skillwise - Enhancing dotnet app
Skillwise - Enhancing dotnet appSkillwise - Enhancing dotnet app
Skillwise - Enhancing dotnet appSkillwise Group
 
Elag 2012 - Under the hood of 3TU.Datacentrum.
Elag 2012 - Under the hood of 3TU.Datacentrum.Elag 2012 - Under the hood of 3TU.Datacentrum.
Elag 2012 - Under the hood of 3TU.Datacentrum.Egbert Gramsbergen
 
ashok mule rfid presentation
ashok mule   rfid presentationashok mule   rfid presentation
ashok mule rfid presentationAkash Maurya
 
Tamper Detection & Discrimination in Passive RFID Systems using Steganography
Tamper Detection & Discrimination in Passive RFID Systems using SteganographyTamper Detection & Discrimination in Passive RFID Systems using Steganography
Tamper Detection & Discrimination in Passive RFID Systems using SteganographyManishgant A Padmanabhan
 
Leveraging Big Data and Real-Time Analytics at Cxense
Leveraging Big Data and Real-Time Analytics at CxenseLeveraging Big Data and Real-Time Analytics at Cxense
Leveraging Big Data and Real-Time Analytics at CxenseSimon Lia-Jonassen
 
Discovering Things and Things’ data/services
Discovering Things and  Things’ data/servicesDiscovering Things and  Things’ data/services
Discovering Things and Things’ data/servicesPayamBarnaghi
 

Ähnlich wie ppt (20)

Presentation 1 rfid introduction
Presentation 1 rfid introductionPresentation 1 rfid introduction
Presentation 1 rfid introduction
 
RFID Based Asset Management case stories
RFID Based Asset Management case storiesRFID Based Asset Management case stories
RFID Based Asset Management case stories
 
RFID Technology
RFID TechnologyRFID Technology
RFID Technology
 
Automated Storage/Retrieval System and Automatic Identification and Data Capt...
Automated Storage/Retrieval System and Automatic Identification and Data Capt...Automated Storage/Retrieval System and Automatic Identification and Data Capt...
Automated Storage/Retrieval System and Automatic Identification and Data Capt...
 
rfid presentation
rfid presentationrfid presentation
rfid presentation
 
Temporal Pattern Mining
Temporal Pattern MiningTemporal Pattern Mining
Temporal Pattern Mining
 
Dma unit 1
Dma unit   1Dma unit   1
Dma unit 1
 
03 internet-of-things-rfid-systems-and-applications
03 internet-of-things-rfid-systems-and-applications03 internet-of-things-rfid-systems-and-applications
03 internet-of-things-rfid-systems-and-applications
 
GrandesMentes_Library AutomationSolution.pptx
GrandesMentes_Library AutomationSolution.pptxGrandesMentes_Library AutomationSolution.pptx
GrandesMentes_Library AutomationSolution.pptx
 
R1x g22 rfid ii
R1x g22 rfid iiR1x g22 rfid ii
R1x g22 rfid ii
 
15215237 pss7-ans
15215237 pss7-ans15215237 pss7-ans
15215237 pss7-ans
 
matdid473708.pdf
matdid473708.pdfmatdid473708.pdf
matdid473708.pdf
 
Barcode & RFiD in Supply Chain
Barcode & RFiD in Supply ChainBarcode & RFiD in Supply Chain
Barcode & RFiD in Supply Chain
 
Building your big data solution
Building your big data solution Building your big data solution
Building your big data solution
 
Skillwise - Enhancing dotnet app
Skillwise - Enhancing dotnet appSkillwise - Enhancing dotnet app
Skillwise - Enhancing dotnet app
 
Elag 2012 - Under the hood of 3TU.Datacentrum.
Elag 2012 - Under the hood of 3TU.Datacentrum.Elag 2012 - Under the hood of 3TU.Datacentrum.
Elag 2012 - Under the hood of 3TU.Datacentrum.
 
ashok mule rfid presentation
ashok mule   rfid presentationashok mule   rfid presentation
ashok mule rfid presentation
 
Tamper Detection & Discrimination in Passive RFID Systems using Steganography
Tamper Detection & Discrimination in Passive RFID Systems using SteganographyTamper Detection & Discrimination in Passive RFID Systems using Steganography
Tamper Detection & Discrimination in Passive RFID Systems using Steganography
 
Leveraging Big Data and Real-Time Analytics at Cxense
Leveraging Big Data and Real-Time Analytics at CxenseLeveraging Big Data and Real-Time Analytics at Cxense
Leveraging Big Data and Real-Time Analytics at Cxense
 
Discovering Things and Things’ data/services
Discovering Things and  Things’ data/servicesDiscovering Things and  Things’ data/services
Discovering Things and Things’ data/services
 

Mehr von Videoguy

Energy-Aware Wireless Video Streaming
Energy-Aware Wireless Video StreamingEnergy-Aware Wireless Video Streaming
Energy-Aware Wireless Video StreamingVideoguy
 
Microsoft PowerPoint - WirelessCluster_Pres
Microsoft PowerPoint - WirelessCluster_PresMicrosoft PowerPoint - WirelessCluster_Pres
Microsoft PowerPoint - WirelessCluster_PresVideoguy
 
Proxy Cache Management for Fine-Grained Scalable Video Streaming
Proxy Cache Management for Fine-Grained Scalable Video StreamingProxy Cache Management for Fine-Grained Scalable Video Streaming
Proxy Cache Management for Fine-Grained Scalable Video StreamingVideoguy
 
Free-riding Resilient Video Streaming in Peer-to-Peer Networks
Free-riding Resilient Video Streaming in Peer-to-Peer NetworksFree-riding Resilient Video Streaming in Peer-to-Peer Networks
Free-riding Resilient Video Streaming in Peer-to-Peer NetworksVideoguy
 
Instant video streaming
Instant video streamingInstant video streaming
Instant video streamingVideoguy
 
Video Streaming over Bluetooth: A Survey
Video Streaming over Bluetooth: A SurveyVideo Streaming over Bluetooth: A Survey
Video Streaming over Bluetooth: A SurveyVideoguy
 
Video Streaming
Video StreamingVideo Streaming
Video StreamingVideoguy
 
Reaching a Broader Audience
Reaching a Broader AudienceReaching a Broader Audience
Reaching a Broader AudienceVideoguy
 
Considerations for Creating Streamed Video Content over 3G ...
Considerations for Creating Streamed Video Content over 3G ...Considerations for Creating Streamed Video Content over 3G ...
Considerations for Creating Streamed Video Content over 3G ...Videoguy
 
ADVANCES IN CHANNEL-ADAPTIVE VIDEO STREAMING
ADVANCES IN CHANNEL-ADAPTIVE VIDEO STREAMINGADVANCES IN CHANNEL-ADAPTIVE VIDEO STREAMING
ADVANCES IN CHANNEL-ADAPTIVE VIDEO STREAMINGVideoguy
 
Impact of FEC Overhead on Scalable Video Streaming
Impact of FEC Overhead on Scalable Video StreamingImpact of FEC Overhead on Scalable Video Streaming
Impact of FEC Overhead on Scalable Video StreamingVideoguy
 
Application Brief
Application BriefApplication Brief
Application BriefVideoguy
 
Video Streaming Services – Stage 1
Video Streaming Services – Stage 1Video Streaming Services – Stage 1
Video Streaming Services – Stage 1Videoguy
 
Streaming Video into Second Life
Streaming Video into Second LifeStreaming Video into Second Life
Streaming Video into Second LifeVideoguy
 
Flash Live Video Streaming Software
Flash Live Video Streaming SoftwareFlash Live Video Streaming Software
Flash Live Video Streaming SoftwareVideoguy
 
Videoconference Streaming Solutions Cookbook
Videoconference Streaming Solutions CookbookVideoconference Streaming Solutions Cookbook
Videoconference Streaming Solutions CookbookVideoguy
 
Streaming Video Formaten
Streaming Video FormatenStreaming Video Formaten
Streaming Video FormatenVideoguy
 
iPhone Live Video Streaming Software
iPhone Live Video Streaming SoftwareiPhone Live Video Streaming Software
iPhone Live Video Streaming SoftwareVideoguy
 
Glow: Video streaming training guide - Firefox
Glow: Video streaming training guide - FirefoxGlow: Video streaming training guide - Firefox
Glow: Video streaming training guide - FirefoxVideoguy
 

Mehr von Videoguy (20)

Energy-Aware Wireless Video Streaming
Energy-Aware Wireless Video StreamingEnergy-Aware Wireless Video Streaming
Energy-Aware Wireless Video Streaming
 
Microsoft PowerPoint - WirelessCluster_Pres
Microsoft PowerPoint - WirelessCluster_PresMicrosoft PowerPoint - WirelessCluster_Pres
Microsoft PowerPoint - WirelessCluster_Pres
 
Proxy Cache Management for Fine-Grained Scalable Video Streaming
Proxy Cache Management for Fine-Grained Scalable Video StreamingProxy Cache Management for Fine-Grained Scalable Video Streaming
Proxy Cache Management for Fine-Grained Scalable Video Streaming
 
Adobe
AdobeAdobe
Adobe
 
Free-riding Resilient Video Streaming in Peer-to-Peer Networks
Free-riding Resilient Video Streaming in Peer-to-Peer NetworksFree-riding Resilient Video Streaming in Peer-to-Peer Networks
Free-riding Resilient Video Streaming in Peer-to-Peer Networks
 
Instant video streaming
Instant video streamingInstant video streaming
Instant video streaming
 
Video Streaming over Bluetooth: A Survey
Video Streaming over Bluetooth: A SurveyVideo Streaming over Bluetooth: A Survey
Video Streaming over Bluetooth: A Survey
 
Video Streaming
Video StreamingVideo Streaming
Video Streaming
 
Reaching a Broader Audience
Reaching a Broader AudienceReaching a Broader Audience
Reaching a Broader Audience
 
Considerations for Creating Streamed Video Content over 3G ...
Considerations for Creating Streamed Video Content over 3G ...Considerations for Creating Streamed Video Content over 3G ...
Considerations for Creating Streamed Video Content over 3G ...
 
ADVANCES IN CHANNEL-ADAPTIVE VIDEO STREAMING
ADVANCES IN CHANNEL-ADAPTIVE VIDEO STREAMINGADVANCES IN CHANNEL-ADAPTIVE VIDEO STREAMING
ADVANCES IN CHANNEL-ADAPTIVE VIDEO STREAMING
 
Impact of FEC Overhead on Scalable Video Streaming
Impact of FEC Overhead on Scalable Video StreamingImpact of FEC Overhead on Scalable Video Streaming
Impact of FEC Overhead on Scalable Video Streaming
 
Application Brief
Application BriefApplication Brief
Application Brief
 
Video Streaming Services – Stage 1
Video Streaming Services – Stage 1Video Streaming Services – Stage 1
Video Streaming Services – Stage 1
 
Streaming Video into Second Life
Streaming Video into Second LifeStreaming Video into Second Life
Streaming Video into Second Life
 
Flash Live Video Streaming Software
Flash Live Video Streaming SoftwareFlash Live Video Streaming Software
Flash Live Video Streaming Software
 
Videoconference Streaming Solutions Cookbook
Videoconference Streaming Solutions CookbookVideoconference Streaming Solutions Cookbook
Videoconference Streaming Solutions Cookbook
 
Streaming Video Formaten
Streaming Video FormatenStreaming Video Formaten
Streaming Video Formaten
 
iPhone Live Video Streaming Software
iPhone Live Video Streaming SoftwareiPhone Live Video Streaming Software
iPhone Live Video Streaming Software
 
Glow: Video streaming training guide - Firefox
Glow: Video streaming training guide - FirefoxGlow: Video streaming training guide - Firefox
Glow: Video streaming training guide - Firefox
 

ppt

  • 1. RFID Data Management Kamlesh Laddhad (05329014) Karthik B.(05329021) Guide: Prof. Bernard Menezes
  • 2. Outline • Introduction to RFID Technology. • Issues with RFID Technology. • RFID Data Characteristics. • Data Warehousing. – Expressive Temporal Model: Dynamic Relationship ER Model – RFID - Cuboids. – Use of Bitmap Datatype. • Data Cleaning. – Extensible Sensor stream Processing (ESP) – Statistical sMoothing for Unreliable RFid data.(SMURF) • Future Plans.
  • 3. Introduction • Radio Frequency Identification: – It is an Automatic Identification and Data Capture Technology. – Fast – No contact or line of sight. – Uses radio-frequency waves to transfer data • Components – Tag: small, low-cost device that can hold a limited amount of data. • Associated with objects, such as pallets, cases, and even individual items. – Reader: Recognize presence of tag and read info stored on it. • Unique electronic product code (EPC) associated with a tag. • By placing RFID tag readers at various locations, one can track the movement of objects through supply chain networks.
  • 4. Applications and Adoptions • Supply Chain Management: real-time inventory tracking. – US Department Of Defense: shipments to armed forces • Retail: Active shelves monitor product availability – Wal-Mart, Albertson: Major Retails stores • Access control: toll collection, transportation. – Airline luggage management: • British airways:20 million bags a year • Implemented to reduce lost/misplaced luggage • Anti-counterfeiting and security: – Food and Drug Administration: To reduce counterfeit in pharmaceutical supply chain
  • 5. Prospective for RFID research • The physics of building tags and readers – Tags have few gates: Apart from basic operation, very less computing power. – Radio-frequency has some issues with operating in certain physical mediums. • The privacy and safety issues: – Complex encryption schemes are not possible on RFID tags. – Counterfeiting by means of either illegitimate readers or spoofed tags are possible – Reader-tag communication is wireless: Third parties can eavesdrop on signals. • Software Architecture to collect, filter, organize, and answer online queries: – No. of tags are proportional to No of items being serviced/tracked. – No. of readers are proportional to traceable strategic locations/areas • Each Reader picks up tag signals on continuous basis. • Data generated by RFID systems is enormous: • E.g. Wal-Mart is expected to generate 7 terabytes of RFID data per day. • Our Focus: Third Stream.
  • 7. Data Management Challenges • Data Explosion : Example – A retailer with 3,000 stores, selling 10,000 items a day per store. – Each item moves 10 times on average before being sold • Movement recorded as (EPC, location, second) – Data volume: 300 million tuples per day. – Example OLAP Query: “Average time for items to move from warehouse to checkout counter in March 2006?”. • Costly to answer if there are a billion tuples for March 2006.
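The data-volume figure on this slide follows directly from its three inputs. A quick check, assuming one tuple is recorded per movement (variable names are illustrative only):

```python
# Back-of-the-envelope check of the data-explosion example.
stores = 3_000
items_sold_per_store_per_day = 10_000
moves_per_item = 10  # average movements before an item is sold

# Assumption: each movement produces exactly one (EPC, location, second) tuple.
tuples_per_day = stores * items_sold_per_store_per_day * moves_per_item

print(f"{tuples_per_day:,} tuples per day")
```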
  • 8. Data Characteristics • Temporal and history-oriented – Applications dynamically generate observations (readings). – Object locations and the containment relationships among objects change over time. – Need: an expressive data model. • Inaccurate data and implicit semantics – False positive: a non-existent tag is incorrectly read. – False negative: the reader misses a tag that was in its vicinity. – Noisy data and duplicate readings (redundancy): the same tag is read more than once. – Need: automated data filtering and transformation. • Streaming and large volume – Objects stay in place for long durations while readers record them periodically, so data keeps accumulating. – This data must be preserved for tracking and monitoring. – Need: a scalable storage scheme and compression techniques to reduce the data. • Data granularity – The data collection granularity needs to be decided. – It differs across applications.
  • 9. Warehousing Helps!! • Lossless compression – Remove redundancy: (r1,l1,t1) (r1,l1,t2) ... (r1,l1,t10) => (r1,l1,t1,t10) – Group objects that move and stay together. • Data cleaning: Multi-reading, missed-reading, error-reading, bulky movement. • Data mining: Find trends, outliers, frequent, sequential, flow patterns. • Multi-dimensional summary: product, location, time, … – Store manager: Check item movements from the backroom to different shelves in his store – Region manager: Collapse intra-store movements and look at distribution centers, warehouses, and stores • Query Processing – Support for OLAP: roll-up, drill-down, slice, and dice – Path query: New to RFID-Warehouses, about the structure of paths • What products that go through quality control have shorter paths? • What locations are common to the paths of a set of defective auto-parts? • Identify containers at a port that have deviated from their historic paths
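The redundancy-removal step, (r1,l1,t1) … (r1,l1,t10) => (r1,l1,t1,t10), can be sketched as follows. This is a minimal illustration, not the warehouse's actual implementation: the function name and the tuple layout (EPC, location, time) are assumptions, and the result is only lossless if readers sample at a known, regular interval.

```python
from itertools import groupby


def compress_readings(readings):
    """Collapse runs of (epc, location, time) readings into stay records.

    Each uninterrupted run of readings with the same EPC and location
    becomes one (epc, location, time_in, time_out) tuple.
    """
    stays = []
    # Order by EPC, then time, so that each stay forms a consecutive run.
    ordered = sorted(readings, key=lambda r: (r[0], r[2]))
    for (epc, location), run in groupby(ordered, key=lambda r: (r[0], r[1])):
        times = [t for _, _, t in run]
        stays.append((epc, location, times[0], times[-1]))
    return stays


# Ten readings at l1 followed by two readings at l2.
raw = [("r1", "l1", t) for t in range(1, 11)] + [("r1", "l2", 11), ("r1", "l2", 12)]
stays = compress_readings(raw)
print(stays)
```

Twelve raw tuples shrink to two stay records here; an object that later returns to a location it has already visited yields a separate stay record.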
  • 10. Dynamic Relationship ER Model • Proposed by Wang and Liu from Siemens. • RFID entities are static and are not altered. • RFID relationships are dynamic and change all the time. • Two types of dynamic relationships are added: – Event-based dynamic relationship: a timestamp attribute records when the event occurred. – State-based dynamic relationship: tstart and tend attributes record the lifespan of a state.
  • 11. • Static entity tables – SENSOR(sensor_epc, name, description) – OBJECT(object_epc, name, description) – LOCATION(location_id, name, owner) – TRANSACTION(transaction_id, transaction_type) • Dynamic relationship tables – OBSERVATION(sensor_epc, value, timestamp) – CONTAINMENT(epc, parent_epc, tstart, tend) – OBJECTLOCATION(epc, location_id, tstart, tend) – SENSORLOCATION(sensor_epc, location_id, position, tstart, tend) – TRANSACTIONITEM(transaction_id, epc, timestamp)
  • 12. Monitoring • Missing RFID object detection:
    – Find when and where the object with EPC 'MEPC' was lost (its last recorded location):
        select location_id, tstart, tend
        from objectlocation
        where epc = 'MEPC'
          and tstart = (select max(o.tstart) from objectlocation o where o.epc = 'MEPC')
    – Check whether any objects are missing at the current location C, knowing that all objects were present at the previous location L at time T:
        select l.epc
        from objectlocation l
        where l.location_id = 'L' and l.tstart <= 'T' and l.tend >= 'T'
          and l.epc not in (select c.epc from objectlocation c where c.location_id = 'C')
  • 13. Tracking • RFID object moving time inquiry:
    – How long does it take to supply 'OEPC' from location S to location E?
        select (e.tstart - s.tstart) as supplying_time
        from objectlocation e, objectlocation s
        where e.epc = 'OEPC' and s.epc = 'OEPC'
          and s.location_id = 'S' and e.location_id = 'E'
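To see the monitoring and tracking queries run, the sketch below loads a few made-up OBJECTLOCATION rows into an in-memory SQLite database. The sample rows, the integer timestamps, and the choice of SQLite are assumptions made for illustration; the slides do not name a particular DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE objectlocation "
    "(epc TEXT, location_id TEXT, tstart INTEGER, tend INTEGER)"
)
# Hypothetical history: OEPC moves S -> X -> E; another object stays at S.
conn.executemany(
    "INSERT INTO objectlocation VALUES (?, ?, ?, ?)",
    [
        ("OEPC", "S", 100, 160),
        ("OEPC", "X", 170, 250),
        ("OEPC", "E", 260, 400),
        ("OTHER", "S", 90, 120),
    ],
)

# Tracking: time taken to supply OEPC from location S to location E.
supplying_time = conn.execute(
    """
    SELECT e.tstart - s.tstart AS supplying_time
    FROM objectlocation e, objectlocation s
    WHERE e.epc = 'OEPC' AND s.epc = 'OEPC'
      AND s.location_id = 'S' AND e.location_id = 'E'
    """
).fetchone()[0]

# Monitoring: last recorded location of OEPC.
last_seen = conn.execute(
    """
    SELECT location_id, tstart, tend
    FROM objectlocation
    WHERE epc = 'OEPC'
      AND tstart = (SELECT MAX(o.tstart) FROM objectlocation o WHERE o.epc = 'OEPC')
    """
).fetchone()

print(supplying_time, last_seen)
```

The tracking query as written assumes the object visits S and E once each; with repeated visits it returns one row per pair of visits.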
  • 14. Compression Idea • Bulky object movements – Objects often move and stay together through the supply chain. – If 1000 packs of product P stay together at the distribution center, register a single record: (GID, distribution center, time_in, time_out). – GID is a generalized identifier that represents the 1000 packs that stayed together at the distribution center. • Analysis usually takes place at a much higher level of abstraction than the one present in raw RFID data. • [Figure: objects move in progressively smaller groups. Factory to distribution centers: 10 pallets (1000 cases); distribution center to stores: 20 cases (1000 packs); store to shelves: 10 packs (12 sodas).]
  • 15. RFID Cuboids • Fact table: (EPC, location, time_in, time_out). • In a supply chain, items travel through a series of locations. • Query: what is the average time that product P stays at a store in location A? • Traditional cubes miss the path structure of the data. • Stay table: (GIDs, location, time_in, time_out: measures) – Records information on items that stay together at a given location. – Recording transitions instead would make queries hard to answer, since many intersections would be needed. • Map table: (GID, <GID1, ..., GIDn>) – Links together stages that belong to the same path, providing additional compression and query-processing efficiency. – A high-level GID points to lower-level GIDs. – Saving complete EPC lists instead would incur high I/O costs to retrieve long lists and make query processing costly. • Information table: (EPC list, attribute 1, ..., attribute n) – Records path-independent attributes of the items, e.g., color, manufacturer, price.
  • 16. EPC Overview • Electronic Product Code – Standard naming scheme proposed by the Auto-ID Center. – An EPC uniquely identifies an item. – Format: <Header, Manager No., Object Class, Serial No.> • Header: identifies the length, type, structure, version, and generation of the EPC. • Manager Number: identifies an organizational entity. • Object Class: identifies a “class”, or type of thing. • Serial Number: identifies a specific instance of the Object Class being tagged. – We will refer to • <Header, Manager No., Object Class> as the prefix • <Serial No.> as the suffix
  • 17. Use of Bitmap Datatype • Observation: items move together. – Groups of items in the same proximity, e.g., on a shelf or in a shipment. – Groups of items with the same property, e.g., the same product. • Use a bitmap type to model the collections of EPCs that occur in item-tracking applications. – Instead of storing a tuple per item, store one tuple for all the items that share a prefix. – New fields replace the epc field: • <Len, Suffix_length, Prefix, Suffix_start, Suffix_end, Bitmap>
  • 18. Example: Product Inventory
    • With EPC collections
        Store_id   Prod_id   Time   Item_collection
        s1         p1        t1     epc11, epc12, epc13, …
        s1         p2        t2     epc21, epc22, epc23, …
        …          …         …      …
    • With epc_bitmaps
        Store_id   Prod_id   Time   Item_bmap
        s1         p1        t1     bmap1
        s1         p2        t2     bmap2
        …          …         …      …
  • 19. Use of Bitmap Datatype
    • EPC layout:
        Header   EPC_Manager   Object_Class   Serial_Number
        2 bits   21 bits       17 bits        24 bits
    • EPCs in the collection: 0x4AA890001F62C160 … 0x4AA890001FA0B38E
    • Resulting bitmap tuple:
        Len   Suff_len   Prefix         Suff_start   Suff_end   Bitmap
        64    24         0x4AA890001F   0x62C160     0xA0B38E   101001…00010
  • 20. Bitmap Operations • To use this datatype in SQL, we need operations on bitmaps. • Conversion and counting operations: epc2Bmap, bmap2Epc, and bmap2Count • Pairwise logical operations: bmapAnd, bmapOr, bmapMinus, and bmapXor • Maintenance operations: bmapInsert and bmapDelete • Membership testing operation: bmapExists • Comparison operation: bmapEqual
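The idea behind the bitmap tuple and a few of the operations can be sketched in Python. This is an illustrative model only, not the DBMS implementation: the class and function names are hypothetical stand-ins for epc2Bmap, bmap2Epc, bmap2Count and bmapMinus, the bitmap is held as a Python integer, and the 24-bit suffix width is taken from the EPC layout on the previous slide.

```python
from dataclasses import dataclass

SUFFIX_BITS = 24  # serial-number width in the EPC layout shown earlier
SUFFIX_MASK = (1 << SUFFIX_BITS) - 1


@dataclass(frozen=True)
class EpcBitmap:
    prefix: int        # header + manager number + object class
    suffix_start: int  # serial number represented by bit 0
    suffix_end: int    # largest serial number covered
    bits: int          # bit k set <=> serial (suffix_start + k) is present


def epc2bmap(epcs):
    """Build one bitmap from EPCs that all share the same prefix."""
    prefixes = {e >> SUFFIX_BITS for e in epcs}
    if len(prefixes) != 1:
        raise ValueError("all EPCs must share exactly one prefix")
    suffixes = [e & SUFFIX_MASK for e in epcs]
    start, end = min(suffixes), max(suffixes)
    bits = 0
    for s in suffixes:
        bits |= 1 << (s - start)
    return EpcBitmap(prefixes.pop(), start, end, bits)


def bmap2epc(b):
    """Expand a bitmap back into a sorted list of EPCs."""
    return [
        (b.prefix << SUFFIX_BITS) | (b.suffix_start + k)
        for k in range(b.suffix_end - b.suffix_start + 1)
        if (b.bits >> k) & 1
    ]


def bmap2count(b):
    """Number of EPCs in the collection."""
    return bin(b.bits).count("1")


def bmap_minus(a, b):
    """EPCs present in a but not in b."""
    if a.prefix != b.prefix:
        return a  # different prefixes share no EPCs
    # Align both bitmaps on a common starting serial number.
    start = min(a.suffix_start, b.suffix_start)
    a_bits = a.bits << (a.suffix_start - start)
    b_bits = b.bits << (b.suffix_start - start)
    return EpcBitmap(a.prefix, start, a.suffix_end, a_bits & ~b_bits)


before = epc2bmap([0x4AA890001F62C160, 0x4AA890001F62C162])
after = epc2bmap([0x4AA890001F62C160, 0x4AA890001F62C161, 0x4AA890001F62C162])
added = bmap2epc(bmap_minus(after, before))
print([hex(e) for e in added])
```

Storing only the range between the smallest and largest suffix keeps the bitmap short when serial numbers are dense; a sparse range would call for compression.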
  • 21. Use of these operations in SQL
    • Items added to a given shelf between times t1 and t2:
        SELECT bmap2Epc(bmapMinus(s2.item_bmap, s1.item_bmap))
        FROM Shelf_Inventory s1, Shelf_Inventory s2
        WHERE s1.shelf_id = <sid1> AND s1.shelf_id = s2.shelf_id
          AND s1.time = <t1> AND s2.time = <t2>;
    • A book store classifies its books into various categories. The following query finds the shelves that currently hold books with both the 'Adventure' and 'Romance' properties:
        SELECT s.shelf_id
        FROM Shelf_Inventory s
        WHERE bmap2Count(bmapAnd(s.item_bmap,
                  (SELECT bmapAnd(p.Adventure, p.Romance) FROM Property_Inventory p))) > 0
          AND s.time = <current_date>;
  • 22. Road Ahead • Extension to the bitmap proposal: – The bitmap datatype is better suited to initial bulk loads and batch updates. – It performs badly for incremental updates. – A 'hybrid scheme' for incremental updates: • Maintain periodic inventory checkpoints using bitmaps. • For changes occurring between checkpoints, maintain a traditional item-level table. • Answer queries by merging the latest checkpoint bitmap with the item-level data for the period since that checkpoint. • The EPC suffixes in a collection may not be contiguous: – The bitmap will then be sparse, with many zeros. – Compress it using some encoding scheme. • Good for initial bulk loading and batch updates. • May reduce the efficiency of bitmap operations.
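The hybrid scheme can be sketched as a merge of a checkpoint with the item-level changes logged since then. This is only an outline of the proposal under stated assumptions: a Python set stands in for the decoded checkpoint bitmap, and the change-log row layout (time, epc, op) is hypothetical.

```python
def current_inventory(checkpoint, changes):
    """Merge the latest checkpoint with item-level changes logged after it.

    checkpoint: set of EPCs present at the last checkpoint (stands in for
                the decoded checkpoint bitmap).
    changes:    iterable of (time, epc, op) rows, op being "add" or "remove",
                playing the role of the traditional item-level table.
    """
    present = set(checkpoint)
    # Replay the changes in time order on top of the checkpoint.
    for _, epc, op in sorted(changes, key=lambda row: row[0]):
        if op == "add":
            present.add(epc)
        elif op == "remove":
            present.discard(epc)
        else:
            raise ValueError(f"unknown operation: {op!r}")
    return present


checkpoint = {"epc11", "epc12", "epc13"}
changes = [(2, "epc12", "remove"), (1, "epc14", "add"), (3, "epc12", "add")]
inventory = current_inventory(checkpoint, changes)
print(sorted(inventory))
```

At the next checkpoint the merged set would be written back as a fresh bitmap and the item-level table truncated, which keeps the per-update cost low between checkpoints.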
  • 23. Open Problems • Efficient methods for data mining problems: – Trend analysis – Outlier detection – Path clustering • We will try exploring data mining applications on RFID data.
  • 25. Issues in Data Cleaning • Lack of completeness – RFID readers capture only 60-70% of all tags that are in the vicinity. – The data is smoothed to compensate for the lost intermediate readings. • Temporal nature of the data, or tag dynamics – RFID tags are in motion, which makes them more difficult to handle. – The motion of a tag causes readings to be dropped. • RFID data streams arrive at high rates and in large volumes – Hence filtering is important before the data is sent to the database.
  • 26. Current Strategies • Temporal granule: – Based on the fact that tag data does not differ much over a small time period. – Readings can be grouped over a small time window. • Spatial granule: – Similarly, data from physically close readers is also homogeneous.
  • 27. Stages of ESP • Point: operates over a single value in a sensor stream, filtered by a predicate in the WHERE clause • Smooth: granularity defined by applications to correct for missed readings temporally (over one input only); uses aggregate function over the input. • Merge: granularity specified by the application to correct for missed readings spatially; grouped by the specified spatial granule.
  • 28. Stages of ESP (contd.) • Arbitrate: deals with conflicts between different spatial granules; grouped by spatial granule first and then uses HAVING construct to determine those conflicts • Virtualize: used for combining data streams from different sources, could also be different devices; join construct is used to combine the different data streams and then filtered using some predicate
  • 29. Smooth stage • False positives (erroneous readings): reporting objects that are not actually present. • False negatives (missed readings): not reporting objects that actually are present. • [Figure: false positives and false negatives [Jeff06]]
  • 30. Tag List • The reader has an internal table called the tag list. • An epoch is the smallest unit of interaction between the reader and the middleware. • Every epoch consists of a certain number of interrogation cycles. • An interrogation cycle is one run of the reader protocol to determine all tags. • At every epoch the reader sends the tag list to the middleware.
        Tag ID     Responses   Timestamp
        12341234   6           t1
        12347890   1           t2
  • 31. SMURF – Per tag Cleaning • SMURF uses statistical methods to reduce the false negatives and false positives in the RFID stream. • The goal is twofold: to determine a statistically sound window size, and to ensure that tag transitions are detected. • To determine the window size, we fit a probability distribution to the sample size |S_i|. • To detect the transition of a tag out of the reader's vicinity, we define a 98% confidence interval on the sample size |S_i| within that distribution.
  • 32. SMURF – Per tag Cleaning (contd.) • Using the tag list, the per-epoch sampling probability p_i,t is determined: p_i,t = (number of times tag i was read in epoch t) / (interrogation cycles per epoch). • Averaging p_i,t over the sample S_i gives the average read rate p_i^avg for tag i. • If the same probability p_i^avg is assumed for every epoch in the window, each epoch is a Bernoulli trial that succeeds when the tag is observed.
  • 33. SMURF – Per tag Cleaning (contd.) • So |S_i| is a binomial random variable for sample S_i, with mean μ = w_i · p_i^avg and variance σ² = w_i · p_i^avg · (1 − p_i^avg). • Using this, the required window size can be expressed as a bound [formula not preserved in this transcript]. • If the current window size is less than the calculated one, the window size is adjusted accordingly. • Similarly, using the central limit theorem, a transition is detected when ||S_i| − μ| > 2σ.
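The per-tag quantities on the last three slides can be put together in a short sketch. The window-size formula itself is missing from the transcript; the bound used here, w_i ≥ ⌈ln(1/δ) / p_i^avg⌉, is the completeness bound from the SMURF paper cited in the references (it follows from (1 − p)^w ≤ e^(−pw) ≤ δ) and is assumed to be what the slide showed. Function names, the value of δ, and the number of interrogation cycles per epoch are illustrative assumptions.

```python
import math


def per_epoch_read_rate(responses, cycles_per_epoch):
    """p_i,t: fraction of interrogation cycles in which the tag answered."""
    return responses / cycles_per_epoch


def required_window(p_avg, delta=0.05):
    """Smallest window (in epochs) that misses a present tag with probability <= delta."""
    return math.ceil(math.log(1 / delta) / p_avg)


def transition_detected(num_readings, window, p_avg):
    """Two-sigma test from the slide: ||S_i| - mu| > 2 * sigma."""
    mean = window * p_avg
    sigma = math.sqrt(window * p_avg * (1 - p_avg))
    return abs(num_readings - mean) > 2 * sigma


# Tag 12341234 answered 6 times; 10 cycles per epoch is a made-up figure.
p = per_epoch_read_rate(6, 10)
w = required_window(0.5, delta=0.05)
print(p, w, transition_detected(3, 10, 0.8), transition_detected(7, 10, 0.8))
```

A low read rate demands a larger window for completeness, while the two-sigma test pulls the window back down when the observed count departs significantly from the binomial mean.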
  • 34. Normal Sliding window…. • Epoch based mid-point sliding window • Emits a reading with an epoch value corresponding to the middle of the window
  • 35. Ensuring Completeness • In the first window, pi avg demands a larger window • Thus window size is increased
  • 36. Transition Detection • In the first window the number of readings decreases by a statistically significant amount. • Thus a transition is likely to have occurred, so the window is halved. [Franklin06]
  • 37. SMURF – Multi-tag aggregate Cleaning • As in per-tag cleaning, the window for multi-tag cleaning is determined by a bound [formula not preserved in this transcript], where p^avg is the average per-epoch sampling probability over all observed tags. • To detect a transition in the population count, we estimate the population count over two windows, [t − w_i, t] and [t − w_i/2, t], with true populations N_w and N_w'. • A transition is signalled when the difference between the two estimates exceeds the limit 2(σ_w + σ_w').
  • 38. SMURF – Multi-tag aggregate Cleaning • To estimate the population count, we use π-estimators [formula for the estimated population count not preserved in this transcript]. • Similarly, using π-estimators and assuming independence across different tags, the variance of the estimate is also estimated [formula not preserved in this transcript]. • Here π_i is the probability of reading tag i at least once during the whole window, given by π_i = 1 − (1 − p_i^avg)^w.
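The two estimator formulas are missing from the transcript. The sketch below uses the standard π-estimator (Horvitz-Thompson) forms, N̂ = Σ 1/π_i and, under independence across tags, V̂ar = Σ (1 − π_i)/π_i², summed over the tags observed in the window; it is an assumption that these are the expressions the slide displayed. Function names and the sample numbers are illustrative.

```python
import math


def pi_estimate(p_avgs, window):
    """Pi-estimator of the tag population and an estimate of its variance.

    p_avgs: average per-epoch read rate p_i^avg of each tag observed
            at least once in the window.
    window: window size w in epochs.
    """
    n_hat = 0.0
    var_hat = 0.0
    for p in p_avgs:
        pi = 1 - (1 - p) ** window  # probability tag i is read at least once
        n_hat += 1 / pi
        var_hat += (1 - pi) / pi ** 2
    return n_hat, var_hat


def population_transition(est_w, est_half):
    """Signal a transition when the estimates differ by more than 2(sigma_w + sigma_w')."""
    (n_w, var_w), (n_half, var_half) = est_w, est_half
    return abs(n_w - n_half) > 2 * (math.sqrt(var_w) + math.sqrt(var_half))


# Three observed tags, each read with probability 0.5 per epoch, one-epoch window.
n_hat, var_hat = pi_estimate([0.5, 0.5, 0.5], window=1)
print(n_hat, var_hat)
```

Each observed tag is weighted by 1/π_i, so tags that are hard to read count for more than one and compensate for similar tags that were missed entirely.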
  • 39. The Road ahead… • RFID applications do not tolerate delays in data delivery. • Data resides either in the cache or in the database: fetching from the database increases processing time, while data in the cache cannot be queried with SQL-like queries. • Anomaly detection is also an important part of object tracking. • Issues such as untraceability, forward security, and database desynchronization are still not completely resolved. • Counterfeiting is one more serious problem with RFID. • In the next stage we expect to look into some of these issues.
  • 42. References • Xiaolei Li, Hector Gonzalez, Jiawei Han and Diego Klabjan. Warehousing and analyzing massive RFID data sets. ICDE, 2006. • Fusheng Wang and Peiya Liu. Temporal management of RFID data. VLDB, 2005. • Timothy Chorma, Ying Hu, Seema Sundara and Jagannathan Srinivasan. Supporting RFID-based item tracking applications in Oracle DBMS using a bitmap datatype. VLDB, 2005.
  • 43. References • Minos Garofalakis, Shawn R. Jeffery and Michael J. Franklin. Adaptive cleaning for RFID data streams. VLDB, 2006. • Michael J. Franklin, Wei Hong, Shawn R. Jeffery, Gustavo Alonso and Jennifer Widom. Declarative support for sensor data cleaning. In Pervasive, 2006. • Sridhar Ramachandran, Sudarshan S. Chawathe, Venkat Krishnamurthy and Sanjay E. Sarma. Managing RFID data. VLDB, 2004.