Medical data signature extraction using modified TF-IDF in DataBridge project

Last Modified
  • February 28, 2019
Creator
  • Hu, Wei
    • Affiliation: School of Information and Library Science
Abstract
  • This project is part of the DataBridge project, which aims to find similar datasets among the large number of medical datasets stored on the DataBridge server using keyword extraction and similarity algorithms. A sample of 1,000 datasets was randomly chosen from the 18,000-dataset corpus. A modified TF-IDF was applied to the sample to generate keywords for each of the 1,000 datasets, followed by similarity analysis. The results indicate that the extracted keywords work well as a basis for calculating similarities between datasets.
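    The pipeline the abstract describes (keyword weighting, then similarity) can be illustrated with a minimal sketch. It uses standard TF-IDF, not the modified variant developed in the thesis, which the abstract does not specify. Cosine similarity, the signature size k, the function names, and the toy descriptions are all assumptions made for illustration only.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document (standard TF-IDF)."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


def top_keywords(weight, k=4):
    """Keep the k highest-weighted terms as the dataset's signature.

    Ties are broken alphabetically so the result is deterministic.
    """
    ranked = sorted(weight.items(), key=lambda item: (-item[1], item[0]))
    return dict(ranked[:k])


def cosine(a, b):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical toy descriptions standing in for dataset metadata.
docs = [
    "blood pressure readings adult patients".split(),
    "blood pressure medication adult patients".split(),
    "hospital billing records county".split(),
]
signatures = [top_keywords(w) for w in tf_idf(docs)]
sim_01 = cosine(signatures[0], signatures[1])  # two related datasets
sim_02 = cosine(signatures[0], signatures[2])  # two unrelated datasets
```

    In this toy example the two blood-pressure descriptions share signature terms and so score higher than the unrelated pair, which shares none. A real run over the 1,000-dataset sample would need the thesis's own weighting and preprocessing.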
Date of publication
Subject
Resource type
Rights statement
  • In Copyright
Advisor
  • Rajasekar, Arcot
Degree
  • Master of Science in Information Science
Degree granting institution
  • University of North Carolina at Chapel Hill
Extent
  • 40
Deposit record
  • 30f8261f-89ed-44cf-9b2a-ab48eda760ef