ingest
cdrApp
2017-07-06T13:00:37.770Z
ccd64451-f0fc-4a42-94ad-226f4041fa4f
modifyDatastreamByValue
RELS-EXT
cdrApp
2017-07-06T13:17:54.277Z
Setting exclusive relation
modifyDatastreamByValue
MD_DESCRIPTIVE
cdrApp
2018-01-25T05:01:37.519Z
modifyDatastreamByValue
MD_DESCRIPTIVE
cdrApp
2018-01-27T05:33:37.451Z
modifyDatastreamByValue
RELS-EXT
fedoraAdmin
2018-02-08T12:50:51.511Z
Setting exclusive relation
addDatastream
MD_TECHNICAL
fedoraAdmin
2018-02-08T12:51:02.263Z
Adding technical metadata derived by FITS
addDatastream
MD_FULL_TEXT
fedoraAdmin
2018-02-08T12:51:18.192Z
Adding full text metadata extracted by Apache Tika
modifyDatastreamByValue
RELS-EXT
fedoraAdmin
2018-02-08T12:51:39.624Z
Setting exclusive relation
modifyDatastreamByValue
MD_DESCRIPTIVE
cdrApp
2018-03-14T01:44:47.719Z
modifyDatastreamByValue
MD_DESCRIPTIVE
cdrApp
2018-05-17T13:36:42.959Z
modifyDatastreamByValue
MD_DESCRIPTIVE
cdrApp
2018-07-11T00:13:41.557Z
modifyDatastreamByValue
MD_DESCRIPTIVE
cdrApp
2018-07-17T20:13:52.450Z
modifyDatastreamByValue
MD_DESCRIPTIVE
cdrApp
2018-08-08T19:40:46.096Z
modifyDatastreamByValue
MD_DESCRIPTIVE
cdrApp
2018-08-15T16:49:33.882Z
modifyDatastreamByValue
MD_DESCRIPTIVE
cdrApp
2018-08-16T19:52:24.075Z
modifyDatastreamByValue
MD_DESCRIPTIVE
cdrApp
2018-09-21T17:17:57.151Z
modifyDatastreamByValue
MD_DESCRIPTIVE
cdrApp
2018-09-26T20:29:21.554Z
modifyDatastreamByValue
MD_DESCRIPTIVE
cdrApp
2018-10-11T21:06:32.683Z
modifyDatastreamByValue
MD_DESCRIPTIVE
cdrApp
2019-03-20T14:27:07.016Z
Jonathan
Hibbard
Author
Department of Biostatistics
Gillings School of Global Public Health
Harnessing Heterogeneity To Improve Patient Outcomes.
We investigate methods of improving medical outcomes through exploiting heterogeneity, with focus on actual implementation.
Advances in data-mining and big-data methods have allowed new and exciting opportunities to alter the precise nature of statistical medical research. Whereas traditional science experimentation has attempted to eliminate causes of variability beyond a small set of variables of interest to be investigated, machine-learning techniques to extract weak and complex signals from noisy data now allow handling of heterogeneous experiments and subjects.
We propose that, viewed through the lens of these modern machine-learning methods, heterogeneous and highly variable data should be regarded as a boon, not a nuisance. In particular, such data allow for the investigation and construction of individualized treatment rules for patients, that is, for the advance of precision medicine.
Two facets of this view are explored in particular. First, we consider the practical design and implementation of data collection experiments that allow for a machine-learning approach while also permitting a traditional experimental view, in order to satisfy investigators from both paradigms. We reference a particular example, the design for a clinical trial investigating the optimal treatment of burns patients (the LIBERTI trial).
This example highlights some particular challenges, statistical, philosophical and logistical, and hopefully some corresponding solutions, that arise when bridging traditional and modern paradigms. Whilst we present our design as an initial solution, from the attempted implementation of this trial we discover, and then explore, particular aspects that are apt for further improvement.
Secondly, we investigate methods to combine and make effective traditional clustering techniques in higher-dimensional data with weak signals, where existing techniques may fail. Motivated by an example of COPD sufferers’ data (the SPIROMICS study), we attempt to develop ways of combining more traditional methods with a machine-learning approach, and fuzzier data-mining methods with ones permitting better inference.
We illustrate our methods on Fisher's Iris data, and the Wisconsin Breast Cancer data set. We explore extensions of the traditional Gaussian mixture model to more general log-concave distributions and highlight what should be interesting theory for such approximations.
Spring 2017
2017
Biostatistics
Statistics
eng
Doctor of Philosophy
Dissertation
University of North Carolina at Chapel Hill Graduate School
Degree granting institution
Biostatistics
Michael
Kosorok
Thesis advisor
text
Hibbard_unc_0153D_17035.pdf
uuid:68a8e4b4-709a-44e2-b0a6-e8b0865c21b9
2017-05-01T17:31:20Z
proquest
2019-07-06T00:00:00
yes
application/pdf
6949438