Visual Information Retrieval in Endoscopic Video Archives

This presentation discusses visual information retrieval in endoscopic surgery videos. Large amounts of surgery video are recorded every day, but the resulting archives are difficult to search. The approach samples every 5th frame and indexes and searches the sampled frames using either a set of global features or a localized global feature. It is tested on a data set of 33 hours of laparoscopy video, which yields 593,446 frames after sampling. The evaluation indicates that sampling every 5th frame is enough to re-find specific frames, that late fusion of features returns interesting results beyond near duplicates, and that the localized feature SIMPLE works better for semantically similar content. An exploratory user study suggests the approach is a useful baseline for interactive video retrieval that helps surgeons re-find specific moments in long procedure videos.

Visual Information Retrieval in
Endoscopic Video Archives
Jennifer Roldan Carlos, Mathias Lux, Xavier Giro-i-Nieto, Pia Munoz
& Nektarios Anagnostopoulos

Motivation
•  Surgery videos are taken every day
•  Operations rooms are fully booked
•  Many procedures already involve video
•  Storing videos is, or will be, required by law

Amount of Videos
•  8-10 h of operations per room and day
•  say 6 hours excluding set-up, etc.
•  5-6 days a week
•  1,560 h of video per year and operating room (OR)
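
The yearly figure can be reproduced with a few lines of arithmetic. This is a minimal sketch that assumes 52 recording weeks per year, which the slide does not state; with that assumption the 1,560 h figure matches the 5-day lower bound.

```python
# Rough annual video volume per operating room (OR), using the slide's figures.
HOURS_PER_DAY = 6     # recording time per day, excluding set-up
WEEKS_PER_YEAR = 52   # assumption: the slide does not state the number of weeks


def hours_per_year(days_per_week: int) -> int:
    """Hours of video recorded per OR and year for a given working week."""
    return HOURS_PER_DAY * days_per_week * WEEKS_PER_YEAR


five_day = hours_per_year(5)  # lower bound of the "5-6 days a week" range
six_day = hours_per_year(6)   # upper bound of the same range
print(five_day, six_day)
```

Under the same assumption, a 6-day week gives the upper bound of the range.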

Use Case of Re-finding Frames
•  Surgeons take "shots"
•  for documentation, for patients, for discussion
•  Shots are intentionally framed
•  and make for excellent representative images

Approach
•  Temporal sampling: every 5th frame
•  Indexing and search based on
•  a set of global features
•  or a localized global feature
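
Temporal sampling as described above can be sketched as follows. The function name `sample_every_nth` is hypothetical, and keeping the original frame number with each sample is an assumption made here so that search results can point back into the video.

```python
from typing import Iterable, Iterator, Tuple, TypeVar

Frame = TypeVar("Frame")


def sample_every_nth(frames: Iterable[Frame], step: int = 5) -> Iterator[Tuple[int, Frame]]:
    """Yield (original_index, frame) for every `step`-th frame, starting at frame 0."""
    if step < 1:
        raise ValueError("step must be >= 1")
    for index, frame in enumerate(frames):
        if index % step == 0:
            yield index, frame


# Toy "video" of 12 frames; integers stand in for decoded images.
sampled = list(sample_every_nth(range(12), step=5))
print(sampled)
```

Only the sampled frames are then passed on to feature extraction and indexing.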

Features Employed
•  Pyramid HOG
•  extensive and large texture feature
•  Color and Edge Directivity Descriptor
•  compact and well performing joint histogram
•  SIMPLE
•  CEDD descriptors of patches at SURF key points
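
The idea of a localized global feature, a global descriptor computed on patches around key points, can be illustrated with a toy sketch. Everything here is a stand-in: a plain intensity histogram replaces CEDD, the key points are passed in by hand instead of coming from a SURF detector, and all function names are hypothetical. How SIMPLE aggregates the patch descriptors is not covered.

```python
from typing import List, Sequence, Tuple

Image = Sequence[Sequence[int]]  # grayscale image, values 0..255


def global_histogram(image: Image, bins: int = 4) -> List[float]:
    """Stand-in global descriptor: normalized intensity histogram (not CEDD)."""
    counts = [0] * bins
    total = 0
    for row in image:
        for value in row:
            counts[min(value * bins // 256, bins - 1)] += 1
            total += 1
    return [c / total for c in counts] if total else [0.0] * bins


def crop_patch(image: Image, center: Tuple[int, int], radius: int) -> List[List[int]]:
    """Square patch around `center` (row, col), clipped at the image border."""
    r, c = center
    top, left = max(r - radius, 0), max(c - radius, 0)
    return [list(row[left:c + radius + 1]) for row in image[top:r + radius + 1]]


def localized_descriptors(image: Image, keypoints: Sequence[Tuple[int, int]],
                          radius: int = 1, bins: int = 4) -> List[List[float]]:
    """Apply the global descriptor to one patch per key point."""
    return [global_histogram(crop_patch(image, kp, radius), bins) for kp in keypoints]


# Toy image: dark left half, bright right half; key points chosen by hand.
image = [[0, 0, 255, 255] for _ in range(4)]
whole = global_histogram(image)
local = localized_descriptors(image, [(1, 0), (1, 3)])
print(whole, local)
```

The whole-image histogram mixes both halves, while each patch descriptor captures only its local neighbourhood.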

Data Set
•  33 hours of video
•  from actual procedures focusing on laparoscopy
•  1,276 videos in total
•  593,446 frames after temporal sampling

Evaluation – Re-Finding in Numbers
•  Randomly selected more than 700 shots
•  Excluded test, white-balance and out-of-patient shots
•  Resulting in 600 sample queries

Evaluation – Re-Finding in Numbers
•  Hypothesis I: every 5th frame is enough to re-find images.
•  Hypothesis II: There is a noticeable difference between global and local features.
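
One way to test Hypothesis I is to query an index of sampled frames with each shot and count how often the top results land close to the shot's source frame. The sketch below is an assumption about such a protocol, not the evaluation actually used: the one-dimensional toy features, the Euclidean distance, the `tolerance` parameter and all function names are hypothetical.

```python
from math import dist
from typing import List, Sequence, Tuple

Vector = Tuple[float, ...]
IndexEntry = Tuple[int, Vector]  # (original frame number, feature vector)


def search(index: Sequence[IndexEntry], query: Vector, k: int = 1) -> List[int]:
    """Return the frame numbers of the k indexed frames closest to the query."""
    ranked = sorted(index, key=lambda entry: dist(entry[1], query))
    return [frame_no for frame_no, _ in ranked[:k]]


def refind_rate(index: Sequence[IndexEntry], queries: Sequence[IndexEntry],
                k: int = 1, tolerance: int = 2) -> float:
    """Fraction of queries with a top-k hit within `tolerance` frames of the source frame."""
    hits = 0
    for true_frame, feature in queries:
        results = search(index, feature, k)
        if any(abs(frame_no - true_frame) <= tolerance for frame_no in results):
            hits += 1
    return hits / len(queries) if queries else 0.0


# Toy data: the feature of frame t is simply (t,), indexed at every 5th frame.
index = [(t, (float(t),)) for t in range(0, 20, 5)]
queries = [(7, (7.0,)), (12, (12.0,))]  # shots taken at unsampled frames

rate_near = refind_rate(index, queries, k=1, tolerance=2)
rate_exact = refind_rate(index, queries, k=1, tolerance=0)
print(rate_near, rate_exact)
```

With a step of 5, every frame is at most 2 frames away from a sampled one, which is why a tolerance of 2 is used in the toy run.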

Evaluation – User Study
•  Exploratory study, thinking aloud test
•  Interactive web page presented to users
•  ten cases with all available shots as queries
•  three non-labeled search engines

Evaluation – User Study
•  Population drawn from our projects
•  experts in processing endoscopic videos
•  well aware of the requirements surgeons have voiced
•  Task was to ...
•  browse diverse results and
•  voice drawbacks and benefits

Findings
•  Sampling every 5th frame works (with headroom)
•  Study participants noted that
•  late fusion works as expected and yields interesting results besides near duplicates
•  SIMPLE works better for semantically similar content, i.e., translated instruments, etc.
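
Late fusion combines the result lists of several features after each feature has been searched separately. The slides do not say which fusion rule is used, so the sketch below picks one common choice as an assumption: min-max normalize the distances per feature, sum them, and re-rank. The function names and the toy distances are hypothetical, and every feature is assumed to score the same set of frames.

```python
from typing import Dict, List


def min_max(scores: Dict[int, float]) -> Dict[int, float]:
    """Rescale distances to [0, 1] so that different features are comparable."""
    lo, hi = min(scores.values()), max(scores.values())
    span = hi - lo
    return {key: (value - lo) / span if span else 0.0 for key, value in scores.items()}


def late_fusion(per_feature: List[Dict[int, float]]) -> List[int]:
    """Rank frame ids by the sum of normalized distances (smaller is better)."""
    fused: Dict[int, float] = {}
    for distances in per_feature:
        for frame_id, value in min_max(distances).items():
            fused[frame_id] = fused.get(frame_id, 0.0) + value
    # Ties are broken by frame id to keep the ranking deterministic.
    return sorted(fused, key=lambda frame_id: (fused[frame_id], frame_id))


# Toy distances from a query to three frames under two features.
texture_distances = {1: 2.0, 2: 0.0, 3: 4.0}
color_distances = {1: 1.0, 2: 5.0, 3: 3.0}
ranking = late_fusion([texture_distances, color_distances])
print(ranking)
```

In the toy run the two features disagree on the best frame, and the fused ranking favours the frame that scores reasonably well under both.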

Conclusions
•  The system does not utilize
•  domain-dependent methods and heuristics
•  runtime- and storage-demanding methods
•  Still, it works out for the use case as a
•  candidate support system for surgeons
•  baseline to start on interactive video retrieval for laparoscopy.

Future Work
•  Salient contours of images
•  focus on being robust against lighting and noise

Future Work
credits for feature & images: Chryssanthi Iakovidou

Time for questions?
Mathias Lux
Associate Professor @ Klagenfurt University, Austria
mlux@itec.aau.at
Thanks go to Jennifer Roldan Carlos, Xavier Giro-i-Nieto,
Pia Munoz & Nektarios Anagnostopoulos


Three things you will take away from the session: • How to run an effective tenant-to-tenant migration • Best practices for before, during, and after migration • Tips for using migration as a springboard to prepare for Copilot in Microsoft 365 Main ideas: Migration Overview: The presentation covers the current reality of cross-tenant migrations, the triggers, phases, best practices, and benefits of a successful tenant migration Considerations: When considering a migration, it is important to consider the migration scope, performance, customization, flexibility, user-friendly interface, automation, monitoring, support, training, scalability, data integrity, data security, cost, and licensing structure Next Wave: The next wave of change includes the launch of Copilot, which requires businesses to be prepared for upcoming changes related to Copilot and the cloud, and to consolidate data and tighten governance ShareGate: ShareGate can help with pre-migration analysis, configurable migration tool, and automated, end-user driven collaborative governance

Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff

sammart93

This presentation explores the impact of HTML injection attacks on web applications, detailing how attackers exploit vulnerabilities to inject malicious code into web pages. Learn about the potential consequences of such attacks and discover effective mitigation strategies to protect your web applications from HTML injection vulnerabilities. for more information visit https://bostoninstituteofanalytics.org/category/cyber-security-ethical-hacking/

HTML Injection Attacks: Impact and Mitigation Strategies

Boston Institute of Analytics

Visual Information Retrieval in Endoscopic Video Archives

1. Visual Information Retrieval in Endoscopic Video Archives Jennifer Roldan Carlos, Mathias Lux, Xavier Giro-i-Nieto, Pia Munoz & Nektarios Anagnostopoulos

2. Motivation •  Surgery videos are taken every day •  Operating rooms are fully booked •  Many procedures already involve video •  Storing videos is / will be required by law

3. Amount of Videos •  8-10 h operations / room and day •  say 6 hours excluding setup, etc. •  5-6 days a week •  1,560 h video / year & operating room (OR)
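
The yearly figure on this slide can be reproduced with a short calculation. The sketch below assumes 52 working weeks per year and the lower bound of 5 days per week; the slide states neither explicitly, and the function name is only illustrative.

```python
# Figures taken from the slide; the week count is an assumption.
HOURS_PER_DAY = 6      # "say 6 hours" of usable video per room and day
DAYS_PER_WEEK = 5      # lower bound of "5-6 days a week"
WEEKS_PER_YEAR = 52    # assumption: no closure weeks


def yearly_video_hours(hours_per_day=HOURS_PER_DAY,
                       days_per_week=DAYS_PER_WEEK,
                       weeks_per_year=WEEKS_PER_YEAR):
    """Return the hours of video recorded per operating room and year."""
    return hours_per_day * days_per_week * weeks_per_year


print(yearly_video_hours())                 # 1560
print(yearly_video_hours(days_per_week=6))  # 1872
```

With 6 days per week the same estimate rises to 1,872 hours, so the 1,560 h on the slide is the conservative end of the range.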

4. Use Case of Re-finding Frames •  Surgeons take "shots" •  documentation, for patients, discussion •  Shots are intentionally framed •  and make for excellent representative images

5. Approach •  Temporal sampling: every 5th frame •  Indexing and search based on •  a set of global features •  or localized global features
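
Temporal sampling of every 5th frame could look roughly like the following. This is a minimal sketch that works on any iterable of frames; the function name `sample_every_nth` and the choice to start at the first frame are assumptions, not details from the slides.

```python
from itertools import islice


def sample_every_nth(frames, step=5):
    """Return an iterator over every `step`-th frame, starting with the first."""
    if step < 1:
        raise ValueError("step must be >= 1")
    return islice(frames, 0, None, step)


# Frame indices stand in for decoded video frames here.
kept = list(sample_every_nth(range(23)))
print(kept)  # [0, 5, 10, 15, 20]
```

Because `islice` is lazy, the same function can be fed a frame generator from a video decoder without loading the whole video into memory.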

6. Late Fusion for Global Features
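
The slide does not spell out the fusion rule. One common way to do late fusion is to normalize each feature's distances and add them up per frame; the sketch below assumes min-max normalization and an unweighted sum, and all names and numbers in it are hypothetical.

```python
def min_max_normalize(distances):
    """Scale a {frame_id: distance} mapping to the range [0, 1]."""
    lo, hi = min(distances.values()), max(distances.values())
    span = hi - lo
    if span == 0:
        return {frame: 0.0 for frame in distances}
    return {frame: (d - lo) / span for frame, d in distances.items()}


def late_fusion(per_feature_distances):
    """Fuse per-feature result lists into one ranking, best match first.

    per_feature_distances: list of non-empty {frame_id: distance} dicts,
    one per feature. A frame missing from one feature's list gets the
    worst normalized distance (1.0) for that feature.
    """
    normalized = [min_max_normalize(d) for d in per_feature_distances]
    frames = set().union(*normalized)
    fused = {f: sum(n.get(f, 1.0) for n in normalized) for f in frames}
    # Sort by fused distance; break ties by frame id for a stable order.
    return sorted(fused, key=lambda f: (fused[f], f))


# Made-up distances for three frames under two global features.
phog = {"a": 0.2, "b": 0.8, "c": 0.5}
cedd = {"a": 10.0, "b": 30.0, "c": 12.0}
print(late_fusion([phog, cedd]))  # ['a', 'c', 'b']
```

Normalizing first keeps a feature with a large numeric distance range from dominating the fused ranking.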

7. Features Employed •  Pyramid HOG •  extensive and large texture feature •  Color and Edge Directivity Descriptor •  compact and well performing joint histogram •  SIMPLE •  CEDD descriptors of patches at SURF key points

8. Data Set •  33 hours of video •  from actual procedures focusing on laparoscopy •  1,276 videos in total •  593,446 frames after temporal sampling
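
As a rough plausibility check, the sampled frame count can be converted back into hours of video. The slides do not state the frame rate, so the 25 fps used below is purely an assumption; the sampling step of 5 comes from the Approach slide.

```python
ASSUMED_FPS = 25           # assumption: frame rate is not given on the slides
SAMPLING_STEP = 5          # every 5th frame is kept
SAMPLED_FRAMES = 593_446   # frame count from the Data Set slide


def sampled_frames_to_hours(sampled_frames, step=SAMPLING_STEP, fps=ASSUMED_FPS):
    """Estimate the original video length in hours from the sampled frame count."""
    return sampled_frames * step / fps / 3600


print(round(sampled_frames_to_hours(SAMPLED_FRAMES), 2))  # 32.97
```

Under that assumption the result is close to the 33 hours on the slide. The estimate is approximate, since each of the 1,276 videos is sampled separately.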

9. Example Results - SIMPLE

10. Evaluation – Re-Finding in Numbers •  Randomly selected more than 700 shots •  Excluding tests, white balance and out-of-patient •  Resulting in 600 sample queries

11. Evaluation – Re-Finding in Numbers •  Hypothesis I: every 5th frame is enough to re-find images. •  Hypothesis II: There is a noticeable difference between global and local features.
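
Hypothesis I is plausible on simple arithmetic grounds: with a step of 5, no frame is more than 2 frames away from an indexed one. The sketch below illustrates that bound only; it is not the evaluation procedure used in the study, and it assumes sampling starts at frame 0 and ignores the end of the video.

```python
def nearest_sampled_frame(frame_index, step=5):
    """Return the index of the sampled frame closest to `frame_index`."""
    lower = frame_index - frame_index % step
    upper = lower + step
    # Prefer the earlier sample when both are equally close.
    return lower if frame_index - lower <= upper - frame_index else upper


def gap_to_nearest_sample(frame_index, step=5):
    """Return the distance in frames to the closest sampled frame."""
    return abs(frame_index - nearest_sampled_frame(frame_index, step))


print([gap_to_nearest_sample(i) for i in range(10)])
# [0, 1, 2, 2, 1, 0, 1, 2, 2, 1]
print(max(gap_to_nearest_sample(i) for i in range(1000)))  # 2
```

A shot taken at an arbitrary frame therefore has a near duplicate in the index that is at most two frames older or newer.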

12. Evaluation – Re-Finding in Numbers

13. Evaluation – User Study •  Exploratory study, thinking aloud test •  Interactive web page presented to users •  ten cases with all available shots as queries •  three non-labeled search engines

14. Evaluation – User Study

15. Evaluation – User Study •  Population drawn from our projects •  experts in processing endoscopic videos •  well aware of the requirements surgeons registered •  Task was to ... •  browse diverse results and •  voice drawbacks and benefits

16. Findings •  Sampling every 5th frame works (with headroom) •  Study participants noted that •  late fusion works as expected and yields interesting results besides near duplicates •  SIMPLE works better for semantically similar content, i.e. translated instruments, etc.

17. Conclusions •  The system does not utilize •  domain dependent methods and heuristics •  run-time and storage demanding methods •  Still, it works out for the use case as a •  candidate support system for surgeons •  baseline to start on interactive video retrieval for laparoscopy.

18. Future Work •  Salient contours of images •  focus on being robust against lighting and noise

19. Future Work credits for feature & images: Chryssanthi Iakovidou

20. Future Work credits for feature & images: Chryssanthi Iakovidou

21. Time for questions? Mathias Lux, Associate Professor @ Klagenfurt University, Austria, mlux@itec.aau.at. Thanks go to Jennifer Roldan Carlos, Xavier Giro-i-Nieto, Pia Munoz & Nektarios Anagnostopoulos
