JHU-ISI Gesture and Skill Assessment Working Set (JIGSAWS):
A Surgical Activity Dataset for Human Motion Modeling

Yixin Gao1, S. Swaroop Vedula1, Carol E. Reiley1, Narges Ahmidi1,
Balakrishnan Varadarajan2, Henry C. Lin3, Lingling Tao1, Luca Zappella4,
Benjamín Béjar5, David D. Yuh6, Chi Chiung Grace Chen7, René Vidal1,
Sanjeev Khudanpur1 and Gregory D. Hager1

1 Johns Hopkins University, Baltimore, MD 21210
2 Google Inc., Mountain View, CA 94043
3 Intuitive Surgical, Inc., Sunnyvale, CA 94086
4 Metaio GmbH, 80335 Munich, Germany
5 Swiss Federal Institute of Technology in Lausanne, 1015 Lausanne, Switzerland
6 Yale University School of Medicine, New Haven, CT 06520
7 Johns Hopkins Medical Institutions, Baltimore, MD 21224
Abstract. Dexterous surgical activity is of interest to many researchers in human motion modeling. In this paper, we describe a dataset of surgical activities and release it for public use. The dataset was captured using the da Vinci Surgical System and consists of kinematic and video data from eight surgeons with different levels of skill performing five repetitions of three elementary surgical tasks on a bench-top model. The tasks, which include suturing, knot-tying and needle-passing, are standard components of most surgical skills training curricula. In addition to the kinematic and video data captured from the da Vinci Surgical System, we are also releasing manual annotations of surgical gestures (atomic activity segments), surgical skill using global rating scores, a standardized cross-validation experimental setup, and a C++/Matlab toolkit for analyzing surgical gestures using hidden Markov models and linear dynamical systems. We refer to the dataset as the JHU-ISI Gesture and Skill Assessment Working Set (JIGSAWS) to indicate the collaboration between Johns Hopkins University (JHU) and Intuitive Surgical Inc. (ISI), Sunnyvale, CA, on collecting these data.
1 Introduction

Studying dexterous human motion is important for at least three reasons. First, insights on how humans acquire dexterity in motion can be applied to facilitate skill acquisition. Second, skill may be objectively assessed using automated technology. Third, dexterous human motion may be partially or completely automated. Partial automation involves human-machine collaboration where humans and robots perform separate parts of the task, for example [8].

Surgery involves dexterous human motion. The eventual goal for studying surgical motion is to improve safety and effectiveness of surgical patient care.

Poor surgical technical skill has been shown to be associated with higher post-surgical patient complications including death [3]. Surgical technical errors were the most common reason for post-surgical complications including re-operation and re-admission [9]. Thus, improving how surgeons acquire technical skill can positively impact safety and effectiveness of surgical patient care.

1.1 The Language of Surgery project

We believe that surgical motion is analogous to human language because it is a composition of elementary activities that are sequentially performed with certain constraints. Consequently, surgical motion can be modeled using techniques that have successfully been applied for analyzing human language and speech. We define the Language of Surgery as a systematic description of surgical activities or proceedings in terms of constituents and rules of composition. Based upon [6], the language of surgical motion involves describing specific actions that surgeons perform with their instruments or hands to achieve an intended surgical goal.

We study surgical activity as an example of dexterous human motion within the Language of Surgery project at the Johns Hopkins University. The overall goals for the project are to establish archives of surgical motion datasets, procedures and protocols to curate and securely store and share the datasets, develop and evaluate models to analyze surgical motion data, develop applications that use our models for teaching and assessing skillful motion to surgical trainees, and conduct research towards human-machine collaboration in surgery.

Surgical activity data may be considered to encompass several types of variables related to human activity during surgery, such as surgical tool motion (kinematics), video, a log of events happening within and beyond the surgical field, and other variables such as the surgeon's posture, speech, or manual annotations.

The objective for this paper is to release and describe a dataset we compiled within one of our studies on skilled human motion, where surgeons performed elementary tasks on a bench-top model in the laboratory using the da Vinci Surgical System [5] (dVSS; Intuitive Surgical, Inc., Sunnyvale, CA). The surgical tasks included in this dataset are typically part of a surgical skills training curriculum. We refer to the dataset being released as the "JHU-ISI Gesture and Skill Assessment Working Set (JIGSAWS)" to indicate that this dataset was collected through a collaborative project between the Johns Hopkins University (JHU) and Intuitive Surgical Inc. (ISI).

1.2 The dVSS and its research interface

Within the Language of Surgery project, several studies have been conducted where we captured surgical activity data using the Application Programming Interface (API) of the dVSS. The dVSS is a tele-robotic surgical system that provides surgeons with enhanced dexterity, precision and control. The dVSS has been widely used to perform minimally invasive procedures in urology, gynecology, general surgery, and cardiothoracic surgery [5]. The dVSS is comprised of a master-side console with two master tool manipulators (MTMs) that are operated by the surgeon, a patient-side robot with three patient-side manipulators

reported having more than 100 hours, subjects B, G, H, and I reported having fewer than 10 hours, and subjects C and F reported between 10 and 100 hours of robotic surgical experience. All subjects were reportedly right-handed.

2.3 Task repetitions

All subjects repeated each surgical task five times. We refer to each of the five instances of a study task performed by a subject as a "trial". Each trial is on average about two minutes in duration for all three surgical tasks. We assigned each trial a unique identifier in the form of "Task_UidRep". For example, "Knot_Tying_B001" in the dataset files indicates the first repetition of subject "B" for the knot-tying task. The JIGSAWS consists of 39 trials of SU, 36 trials of KT, and 28 trials of NP. The data for the remaining trials (1 for SU, 12 for NP, and 4 for KT) are unusable because of corrupted data recordings.
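An identifier of this form can be split programmatically. The following minimal Python sketch is illustrative only (the regular expression and field names are assumptions, not part of the released tooling); it separates a trial identifier into task, subject letter, and repetition number:

```python
import re

# Illustrative only: split a JIGSAWS-style trial identifier such as
# "Knot_Tying_B001" into task name, subject letter, and repetition number.
TRIAL_RE = re.compile(r"^(?P<task>[A-Za-z_]+)_(?P<subject>[A-Z])(?P<rep>\d{3})$")

def parse_trial_id(trial_id):
    m = TRIAL_RE.match(trial_id)
    if m is None:
        raise ValueError(f"Unrecognized trial identifier: {trial_id}")
    return m.group("task"), m.group("subject"), int(m.group("rep"))

# Example: prints ('Knot_Tying', 'B', 1)
print(parse_trial_id("Knot_Tying_B001"))
```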

2.4 Data description

Kinematic data

The JIGSAWS consists of three components: kinematic data, video data, and manual annotations. We captured the kinematic data from the dVSS using its API at 30 Hz. The left and right MTMs, and the first and second PSMs (PSM1 and PSM2, also referred to as the right and left PSMs in this dataset), are included in the dataset. The motion of each manipulator was described by a local frame attached at the far end of the manipulator using 19 kinematic variables, so 76-dimensional data describe the kinematics of all four manipulators listed above. The 19 kinematic variables for each manipulator include Cartesian positions (3 variables, denoted by xyz), a rotation matrix (9 variables, denoted by R), linear velocities (3 variables, denoted by x′y′z′), angular velocities (3 variables, denoted by α′β′γ′, where αβγ are Euler angles), and a gripper angle (denoted by θ). All kinematic variables are represented within a common coordinate system. Table 1 describes the details of the variables included in the kinematic dataset. The kinematic data for the MTMs and PSMs and the video data were synchronized at the same sampling rate.

Table 1. Kinematic data variables.

Column indices   Number of variables   Description of variables
1-3              3                     Left MTM tool tip position (xyz)
4-12             9                     Left MTM tool tip rotation matrix (R)
13-15            3                     Left MTM tool tip linear velocity (x′y′z′)
16-18            3                     Left MTM tool tip rotational velocity (α′β′γ′)
19               1                     Left MTM gripper angle velocity (θ)
20-38            19                    Right MTM kinematics
39-41            3                     PSM1 tool tip position (xyz)
42-50            9                     PSM1 tool tip rotation matrix (R)
51-53            3                     PSM1 tool tip linear velocity (x′y′z′)
54-56            3                     PSM1 tool tip rotational velocity (α′β′γ′)
57               1                     PSM1 gripper angle velocity (θ)
58-76            19                    PSM2 kinematics
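For readers who want to work with these columns directly, here is a minimal Python sketch. The file name and the whitespace-delimited text layout are assumptions about the released kinematic files, not guaranteed by this paper; the column ranges come from Table 1:

```python
import numpy as np

# Assumed layout: one row per 30 Hz sample, 76 whitespace-separated values
# per row, ordered as in Table 1. The file path is illustrative.
kin = np.loadtxt("Suturing_B001.txt")          # shape: (num_frames, 76)
assert kin.shape[1] == 76

# Table 1 uses 1-based column indices; Python slicing is 0-based.
left_mtm_pos    = kin[:, 0:3]    # columns 1-3:  left MTM tool tip position (xyz)
left_mtm_rot    = kin[:, 3:12]   # columns 4-12: left MTM rotation matrix (R, 9 values)
psm1_pos        = kin[:, 38:41]  # columns 39-41: PSM1 tool tip position (xyz)
psm2_kinematics = kin[:, 57:76]  # columns 58-76: all 19 PSM2 variables
```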

Video data

We captured stereo video from both endoscopic cameras of the dVSS at 30 Hz and at a resolution of 640 x 480. The video and kinematic data were synchronized such that each video frame corresponds to a kinematic data frame captured at the same instant of time. The videos in the dataset being released are saved as AVI files encoded in four character code (FOURCC) format with the DX codec. The video files named "capture1" and "capture2" were recorded from the left and right endoscopic cameras, respectively. Figure 2 shows a snapshot of corresponding left and right images for a single frame. The dataset does not include calibration parameters for the two endoscopic cameras.

Fig. 2. Screenshots of the corresponding left and right images for a single frame in the suturing task.
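Because the streams are synchronized one-to-one at 30 Hz, video frame i and kinematic row i describe the same instant. A minimal Python sketch using OpenCV illustrates this pairing; the file names are illustrative assumptions, while the 1:1 frame-to-row correspondence follows from the description above:

```python
import cv2
import numpy as np

# Illustrative paths for one trial's left-camera video and kinematic file.
video = cv2.VideoCapture("Suturing_B001_capture1.avi")
kin = np.loadtxt("Suturing_B001.txt")

frame_index = 100                                   # any frame of interest
video.set(cv2.CAP_PROP_POS_FRAMES, frame_index)     # seek to that frame
ok, frame = video.read()                            # BGR image, 640 x 480
if ok and frame_index < len(kin):
    kin_row = kin[frame_index]                      # the 76 synchronized kinematic values
    print(frame.shape, kin_row[:3])                 # image size and left MTM position
video.release()
```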

2.5 Manual annotations

Surgical activity annotation

A key feature of the JIGSAWS is the manually annotated ground truth for atomic surgical activity segments called "gestures" or "surgemes" [6,11]. A surgical gesture is defined as an atomic unit of intentional surgical activity resulting in a perceivable and meaningful outcome. We specified a common vocabulary comprising 15 elements, detailed in Table 2, to describe gestures for all three tasks in the dataset, through consultation with an experienced cardiac surgeon with an established robotic surgical practice.

Table 2. Gesture vocabulary

Gesture index   Gesture description
G1              Reaching for needle with right hand
G2              Positioning needle
G3              Pushing needle through tissue
G4              Transferring needle from left to right
G5              Moving to center with needle in grip
G6              Pulling suture with left hand
G7              Pulling suture with right hand
G8              Orienting needle
G9              Using right hand to help tighten suture
G10             Loosening more suture
G11             Dropping suture at end and moving to end points
G12             Reaching for needle with left hand
G13             Making C loop around right hand
G14             Reaching for suture with right hand
G15             Pulling suture with both hands
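The gesture annotations are distributed as per-trial transcription files. Purely as a reading aid, the sketch below assumes (this excerpt does not state it) that each transcription line holds a start frame, an end frame, and a gesture label such as G1; the file path is illustrative:

```python
# Assumption: each line of a transcription file reads "start_frame end_frame Gk".
def load_transcription(path):
    segments = []
    with open(path) as f:
        for line in f:
            parts = line.split()
            if len(parts) != 3:
                continue                  # skip blank or malformed lines
            start, end, gesture = int(parts[0]), int(parts[1]), parts[2]
            segments.append((start, end, gesture))
    return segments

for start, end, gesture in load_transcription("transcriptions/Suturing_B001.txt"):
    print(f"frames {start}-{end}: {gesture}")
```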

Table 3. Elements of the modified global rating score

Respect for tissue:
  1 = Frequently used unnecessary force on tissue;
  3 = Careful tissue handling but occasionally caused inadvertent damage;
  5 = Consistent appropriate tissue handling

Suture/needle handling:
  1 = Awkward and unsure with repeated entanglement and poor knot tying;
  3 = Majority of knots placed correctly with appropriate tension;
  5 = Excellent suture control

Time and motion:
  1 = Made unnecessary moves;
  3 = Efficient time/motion but some unnecessary moves;
  5 = Clear economy of movement and maximum efficiency

Flow of operation:
  1 = Frequently interrupted flow to discuss the next move;
  3 = Demonstrated some forward planning and reasonable procedure progression;
  5 = Obviously planned course of operation with efficient transitions between moves

Overall performance:
  1 = Very poor; 3 = Competent; 5 = Clearly superior

Quality of final product:
  1 = Very poor; 3 = Competent; 5 = Clearly superior
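Each element in Table 3 is rated on a 1-5 scale. Purely as an illustration (the meta file format, and whether skill is reported per element or as a summed score, are not specified in this excerpt), a small sketch that validates six element ratings and reports their sum:

```python
GRS_ELEMENTS = (
    "Respect for tissue", "Suture/needle handling", "Time and motion",
    "Flow of operation", "Overall performance", "Quality of final product",
)

def total_grs(ratings):
    # ratings: dict mapping each element in GRS_ELEMENTS to an integer in 1..5.
    if set(ratings) != set(GRS_ELEMENTS):
        raise ValueError("expected exactly the six modified GRS elements")
    if not all(1 <= r <= 5 for r in ratings.values()):
        raise ValueError("each element must be rated on the 1-5 scale")
    return sum(ratings.values())   # 6 to 30 under this summing assumption

print(total_grs({e: 3 for e in GRS_ELEMENTS}))   # -> 18
```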

  1. A C++/Matlab toolkit that can be used to analyze the kinematic data or video data. The toolkit contains three main tools:
     - A grammar-constrained hidden Markov model (HMM) to analyze the kinematic data: This tool implements an HMM framework with the states modeled as Gaussian mixture models, as described in [17], where each gesture can be modeled using one elementary HMM and a trial is modeled as a composite HMM by concatenating the constituent gesture-specific elementary HMMs. Grammatical constraints can be applied on decoding to simplify the search.
     - A sparse-representation based HMM to analyze the kinematic data: This tool implements an HMM with the observations modeled as a sparse linear combination of atoms from a dictionary, as described in [14]. In contrast to the grammar-constrained HMM tool, here the gestures are modeled as the states in an HMM, and each trial can be viewed as an instance from an HMM.
     - A multiple kernel learning framework with linear dynamical system (LDS) and bag-of-features (BoF) approaches to analyze the video data: This tool implements the three methods described in [2,18]: 1) using an LDS to model a video clip corresponding to a surgical gesture, where the observations are the video frames; 2) a BoF approach where a dictionary of visual words is learned and each image is represented by a histogram over the dictionary; 3) a multiple kernel learning framework to optimally combine the LDS and BoF approaches.
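The released toolkit itself is C++/Matlab. Purely as a minimal illustration of the general idea of per-gesture HMMs over kinematic features (not the toolkit, and using the third-party hmmlearn package), one might fit one Gaussian HMM per gesture class on pre-segmented clips and classify a new clip by the highest-scoring model:

```python
import numpy as np
from hmmlearn.hmm import GaussianHMM

def fit_gesture_models(train_clips, n_states=3):
    # train_clips: dict mapping a gesture label to a list of (T_i x D) kinematic arrays.
    models = {}
    for label, clips in train_clips.items():
        X = np.vstack(clips)               # stack all clips of this gesture
        lengths = [len(c) for c in clips]  # per-clip lengths for hmmlearn
        m = GaussianHMM(n_components=n_states, covariance_type="diag", n_iter=20)
        m.fit(X, lengths)
        models[label] = m
    return models

def classify_clip(models, clip):
    # Return the gesture label whose HMM gives the clip the highest log-likelihood.
    return max(models, key=lambda label: models[label].score(clip))
```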

4 Logistics

4.1 Accessing the dataset

The JIGSAWS can be downloaded without a fee from the Language of Surgery website: http://cirl.lcsr.jhu.edu/jigsaws. Access to the dataset requires a free registration, including provision of a valid email address so that we can inform investigators about any issues or corrections to the dataset.

4.2 Data organization

The JIGSAWS is available for download as zip files. Users are free to choose among tasks and data types, including kinematics or videos, together with transcriptions of the surgical gesture annotations and a meta file with the self-proclaimed skill level and overall OSATS score for each task. In addition, users can optionally download the experimental setup with the standardized cross-validation folds we used in our analyses, and a C++/Matlab toolkit using this experimental setup for surgical gesture modeling. Partitioned downloads are enabled.
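The standardized folds themselves ship with the dataset. Purely as an illustration of one common way to build subject-disjoint folds for data of this kind (an assumption for exposition, not a description of the released folds), a short sketch:

```python
from collections import defaultdict

def subject_disjoint_folds(trial_ids):
    # trial_ids: e.g. ["Suturing_B001", "Suturing_C002", ...]; the subject letter
    # is taken from the identifier format described in Section 2.3.
    by_subject = defaultdict(list)
    for tid in trial_ids:
        subject = tid.rsplit("_", 1)[-1][0]      # e.g. "B001" -> "B"
        by_subject[subject].append(tid)
    # One fold per subject: that subject's trials form the test set,
    # all remaining trials form the training set.
    folds = []
    for subject, test in sorted(by_subject.items()):
        train = [t for t in trial_ids if t not in test]
        folds.append((subject, train, test))
    return folds
```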

5 Prior work with the JIGSAWS

The JIGSAWS has been used for several research studies with two major areas of focus: surgical activity recognition and skill assessment. Techniques related to speech recognition, control theory, and computer vision have been developed and applied in the research studies using this dataset. We worked with various features extracted from the dataset, for example, features composed of a subset of the raw kinematics or converted kinematics, features extracted from videos, or features combining information from both kinematics and video, to classify (assign a class label with known segment boundaries) or recognize (assign a class label with unknown segment boundaries) surgical activities, and to evaluate dexterous skill. Below, we briefly summarize the prior work with the JIGSAWS.

In an early study, [12] performed gesture classification using a 3-state elementary HMM to model each gesture in a trial after applying linear discriminant analysis (LDA) to the continuous kinematic data for dimensionality reduction. Later on, [11] applied an HMM on discrete spectrum features extracted from a short-time Fourier transform of the velocity features in the kinematic data with the goal of evaluating surgical skill in tasks and sub-tasks. [10] studied the relation between subtasks and skill using the same features. [17] accomplished a surgical recognition task (jointly identifying gesture boundaries and assigning gesture labels) by applying an HMM to features derived from modeling the kinematic data (after an optional LDA) using Gaussian mixture models (GMMs). The topology for HMMs in [17] was entirely data-derived. These methods were enhanced in [16] through the development of various statistical models

References

  3. J. D. Birkmeyer, J. F. Finks, A. O'Reilly, M. Oerline, A. M. Carlin, A. R. Nunn, J. Dimick, M. Banerjee, and N. J. Birkmeyer. Surgical skill and complication rates after bariatric surgery. New England Journal of Medicine, 369(15):1434-1442, 2013.
  4. S. DiMaio and C. Hasser. The da Vinci research interface. The MIDAS Journal - Systems and Architectures for Computer Assisted Interventions (MICCAI 2008 Workshop), 2008.
  5. G. Guthart and J. Salisbury, Jr. The Intuitive™ telesurgery system: Overview and application. In IEEE International Conference on Robotics and Automation, volume 1, pages 618-621, 2000.
  6. H. C. Lin. Structure in Surgical Motion. Dissertation, Johns Hopkins University.
  7. J. Martin, G. Regehr, R. Reznick, H. MacRae, J. Murnaghan, C. Hutchison, and M. Brown. Objective structured assessment of technical skill (OSATS) for surgical residents. British Journal of Surgery, 84:273-278, 1997.
  8. N. Padoy and G. Hager. Human-machine collaborative surgery using learned models. In IEEE International Conference on Robotics and Automation, pages 5285-5292, May 2011.
  9. S. E. Regenbogen, C. C. Greenberg, D. M. Studdert, S. R. Lipsitz, M. J. Zinner, and A. A. Gawande. Patterns of technical error among surgical malpractice claims: an analysis of strategies to prevent injury to surgical patients. Annals of Surgery, 246(5):705-711, Nov 2007.
  10. C. E. Reiley and G. D. Hager. Decomposition of robotic surgical tasks: An analysis of subtasks and their correlation to skill. In Modeling and Monitoring of Computer Assisted Interventions (M2CAI) - MICCAI Workshop, 2009.
  11. C. E. Reiley and G. D. Hager. Task versus subtask surgical skill evaluation of robotic minimally invasive surgery. In Medical Image Computing and Computer-Assisted Intervention - MICCAI 2009, pages 435-442. Elsevier, 2009.
  12. C. E. Reiley, H. C. Lin, B. Varadarajan, S. Khudanpur, D. D. Yuh, and G. D. Hager. Automatic recognition of surgical motions using statistical modeling for capturing variability. In Medicine Meets Virtual Reality, volume 132, pages 396-401, 2008.
  13. C. E. Reiley, E. Plaku, and G. D. Hager. Motion generation of robotic surgical tasks: Learning from expert demonstrations. In Engineering in Medicine and Biology. Elsevier, 2010.
  14. L. Tao, E. Elhamifar, S. Khudanpur, G. D. Hager, and R. Vidal. Sparse hidden Markov models for surgical gesture classification and skill evaluation. In Information Processing in Computer-Assisted Interventions, volume 7330, pages 167-177. Springer Berlin Heidelberg, 2012.
  15. L. Tao, L. Zappella, G. D. Hager, and R. Vidal. Surgical gesture segmentation and recognition. In Medical Image Computing and Computer-Assisted Intervention - MICCAI 2013, Nagoya, Japan, 2013. Springer.
  16. B. Varadarajan. Learning and Inference Algorithms for Dynamic System Models of Dextrous Motion. Dissertation, Johns Hopkins University, 2011.
  17. B. Varadarajan, C. E. Reiley, H. C. Lin, S. Khudanpur, and G. D. Hager. Data-derived models for segmentation with application to surgical assessment and training. In Medical Image Computing and Computer-Assisted Intervention - MICCAI 2009, pages 426-434. Elsevier, 2009.
  18. L. Zappella, B. Béjar, G. D. Hager, and R. Vidal. Surgical gesture classification from video and kinematic data. Medical Image Analysis, 17:732-745, 2013.