o Developing methods to better interpret DNA microarray data. Within the last few years there has been a dramatic rise in the quantity and quality of genetic information being gathered on humans, animals, plants and micro-organisms. One of the most popular sources of this information is the DNA microarray. In short, a DNA microarray allows the investigator to measure the activity level of every gene within a cell or organism of interest. Statistically, however, problems occur because there are typically thousands of genes (variables) measured and only a few observations (samples) on each gene. Thus, the number of variables is vastly larger than the number of samples, which makes many traditional statistical methods impractical. In this summer's research we will collaborate with an interdisciplinary research team of faculty and students to develop statistical methods that impose biological (metabolic) structure on DNA microarray data to reduce its dimensionality and, ultimately, improve the interpretation of microarray data. Part of this project will involve developing novel statistical methods; another part will involve the analysis of real experimental microarray data.
Background: Students should have had at least 2 semesters of Calculus and at least one course in probability and/or statistics. Experience with computer programming would be helpful though not essential.
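To see why having far more variables than samples causes trouble, the following minimal sketch (simulated data only, not real microarray measurements) computes the rank of a column-centered data matrix. The sample covariance matrix has the same rank, so with n samples it has rank at most n - 1 no matter how many genes are measured, and methods that need to invert it break down. The function names and the Gaussian simulated data are assumptions made purely for illustration.

```python
import random


def matrix_rank(rows, tol=1e-9):
    """Rank of a matrix (list of rows) by Gaussian elimination with pivoting."""
    m = [row[:] for row in rows]
    n_rows, n_cols = len(m), len(m[0])
    rank = 0
    for col in range(n_cols):
        if rank == n_rows:
            break
        # Pick the largest remaining entry in this column as the pivot.
        pivot = max(range(rank, n_rows), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < tol:
            continue  # no usable pivot in this column
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, n_rows):
            factor = m[r][col] / m[rank][col]
            for c in range(col, n_cols):
                m[r][c] -= factor * m[rank][c]
        rank += 1
    return rank


def centered_data_rank(n_samples, n_genes, seed=0):
    """Rank of a column-centered n_samples x n_genes matrix of simulated data."""
    rng = random.Random(seed)
    data = [[rng.gauss(0.0, 1.0) for _ in range(n_genes)]
            for _ in range(n_samples)]
    means = [sum(row[j] for row in data) / n_samples for j in range(n_genes)]
    centered = [[row[j] - means[j] for j in range(n_genes)] for row in data]
    return matrix_rank(centered)


# 5 samples, 50 genes: the 50 x 50 sample covariance matrix has the same
# rank as the centered data matrix, which cannot exceed 5 - 1 = 4.
print(centered_data_rank(n_samples=5, n_genes=50))
```

Reducing the dimensionality, for example by imposing structure on groups of genes, is one way to get back to a setting where the number of quantities estimated is no larger than the data can support.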
o
More powerful sample designs when classification errors are present. Estimates in the presence of an imperfect classification mechanism
(e.g. a diagnostic test) are biased. Similarly, hypothesis tests on
these biased estimates have increased Type II error rates. A popular
sample design technique to address bias from imperfect classification
is a two-phase design. In a two-phase design all sample units are
classified with the imperfect classification mechanism, and a subsample
is classified using a perfect (gold-standard) classifier. Previous
research (Tintle et al. 2007; Tintle et al., in preparation)
shows that when the gold standard is expensive, it can be
cost-effective to reclassify a portion of the sample using the imperfect
classifier, a sample design we call reclassification. In this summer's
research we will extend these results to one or more of the following
cases: (1) investigating whether it is ever cost-effective to classify
sample units three or more times using the imperfect classifier; (2)
considering whether conditional reclassification is ever cost-effective
(that is, reclassifying at different rates depending on which category
was observed after the first classification); and (3) comparing the
reclassification sample design to other diagnostic testing sample
designs, such as pooled testing.
Background: Students should have had at
least 2 semesters of Calculus and at least one course in probability
and/or statistics. Experience with computer programming would be helpful
though not essential.
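As a small illustration of the bias described above, consider the simplest case of a binary classifier such as a diagnostic test. The sketch below assumes the test's sensitivity and specificity are known; these parameters and the function names are introduced here for illustration and are not part of the project description. It shows how the proportion observed with an imperfect test differs from the true proportion, and how the standard Rogan-Gladen correction recovers the true value when the error rates are known exactly.

```python
def apparent_prevalence(p, sensitivity, specificity):
    """Expected proportion classified positive by an imperfect test.

    p is the true proportion of positives. True positives are detected
    with probability `sensitivity`; true negatives are misclassified as
    positive with probability 1 - specificity.
    """
    return p * sensitivity + (1.0 - p) * (1.0 - specificity)


def corrected_prevalence(q, sensitivity, specificity):
    """Rogan-Gladen correction: invert apparent_prevalence for p."""
    denom = sensitivity + specificity - 1.0
    if denom <= 0:
        raise ValueError("test must be better than chance")
    return (q + specificity - 1.0) / denom


true_p = 0.10
q = apparent_prevalence(true_p, sensitivity=0.90, specificity=0.95)
print(q)                                    # biased upward relative to 0.10
print(corrected_prevalence(q, 0.90, 0.95))  # recovers the true proportion
```

In practice the error rates are themselves unknown, which is where designs such as the two-phase and reclassification designs come in.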
o
Students in my research group will investigate questions at the
interface of algebra and topology. The exact nature of the project may
be more algebraic or more topological depending on interest, but
typically the problems we will explore will be algebraic questions whose
answers shed light on questions in topology.
Recent REU research projects have focused on representation theory. A
representation of a group produces matrices that correspond to the
elements of the group and obey the same relations as the group. Some of
these representations are basic in the sense that any representation of
the group is built from these basic representations, known as
irreducible representations. The irreducible representations are thus
analogous to prime numbers: just as any integer greater than 1 may be
factored uniquely into a product of primes, any representation of a group
may be decomposed into irreducible representations. We have developed a
geometric model for a class of groups known as metacyclic groups, and
from the geometric model we are able to obtain many, but not all,
irreducible representations of these groups. It is well known by those
who know it well that the number of irreducible complex representations
of a finite group equals the number of conjugacy classes in the group, but the exact
relationship between the conjugacy classes and irreducible
representations is not well understood.
There are several unanswered questions in this area that students might
explore this summer. First, can the model be modified in some way to
account for all irreducible representations? Second, can the model be
extended to other classes of groups? Third, can the model(s) be used
to establish a correspondence between conjugacy classes and irreducible
representations? Fourth, how much structure do the geometric models
incorporate? For example, how much structure in the geometric model is
preserved by Adams operations on the representation ring?
Background: a semester of linear algebra and a semester of
abstract algebra; additional coursework in algebra and familiarity with
computer algebra systems such as Maple are beneficial, but not necessary.
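The counting fact mentioned above can be explored by brute force on small examples. The sketch below uses the dihedral group of order 2n, a standard example of a metacyclic group, and counts its conjugacy classes. It is only an illustration of the counting side of the question; it does not implement the geometric model described above, and the encoding of group elements as pairs is an assumption made for this example.

```python
def dihedral_elements(n):
    """Elements r^a s^b of the dihedral group of order 2n, stored as (a, b)."""
    return [(a, b) for b in (0, 1) for a in range(n)]


def multiply(g, h, n):
    """Product (r^a s^b)(r^c s^d), using the relation s r = r^(-1) s."""
    (a, b), (c, d) = g, h
    sign = -1 if b else 1
    return ((a + sign * c) % n, (b + d) % 2)


def inverse(g, n):
    """Inverse of r^a s^b: reflections are their own inverses."""
    a, b = g
    return (a, 1) if b else ((-a) % n, 0)


def conjugacy_classes(n):
    """Set of conjugacy classes, each a frozenset of elements."""
    elements = dihedral_elements(n)
    classes = set()
    for g in elements:
        cls = frozenset(
            multiply(multiply(h, g, n), inverse(h, n), n) for h in elements
        )
        classes.add(cls)
    return classes


for n in (3, 4, 5, 6):
    print(n, len(conjugacy_classes(n)))
```

For the dihedral group of order 2n the count is (n + 3)/2 when n is odd and (n + 6)/2 when n is even, which matches the known number of irreducible complex representations in each case.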
o Given a function f: X -> X and a point x in X, the orbit of x is the sequence {x, f(x), f(f(x)), . . . }. When a computer is used to generate an orbit, an actual orbit is never obtained because of round-off error. Instead, we get what is called a pseudo-orbit. Given a positive number d, a d-pseudo-orbit is a sequence of points {x1, x2, x3, . . . } in X with the property that the distance from f(x1) to x2 is less than d, the distance from f(x2) to x3 is less than d, and so on. A d-pseudo-orbit, then, is the mathematical equivalent of a "sloppy orbit" generated by a computer.
This raises the question: given a pseudo-orbit, is it close to an actual orbit? That is, given some small positive number e, is there a small enough number d so that the terms of each d-pseudo-orbit are all within e of the corresponding terms of an actual orbit? When a pseudo-orbit stays within e of an orbit in this way, we say that the pseudo-orbit is e-shadowed by the actual orbit. If, for every e, there is a d such that every d-pseudo-orbit can be e-shadowed, then we say that f has the shadowing property. This is an area of dynamical systems that continues to generate interest and papers. I have two published papers - researched and written with undergraduates - that provide results for dynamical systems based on the unit interval. One of them gives necessary and sufficient conditions for an increasing function to have the shadowing property. It remains to find conditions under which an arbitrary continuous function on the unit interval has the shadowing property. Another possible question, for a given dynamical system, is what the probability is that a given pseudo-orbit is e-shadowed.
This summer, I plan to continue researching which functions - on the unit interval or otherwise - have the shadowing property. Students will read various papers on the subject, and then we will decide what particular question we want to pursue. The full technical description of the project is available here.
Background: This research will rely heavily on concepts such as metric spaces, uniform continuity, and equicontinuity. Students should ideally have a background of a year-long course in undergraduate analysis. A background in probability and some computer programming may also be helpful.
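The definitions of pseudo-orbit and shadowing above can be tried out numerically. The following is a minimal sketch on the unit interval, assuming a map f that sends [0, 1] into itself; the function names, the random perturbation, and the example map f(x) = x/2 are all assumptions chosen for illustration. It checks only finitely many terms, so it illustrates the definitions rather than proving the shadowing property.

```python
import random


def orbit(f, x, length):
    """First `length` terms of the orbit x, f(x), f(f(x)), ..."""
    points = [x]
    for _ in range(length - 1):
        points.append(f(points[-1]))
    return points


def noisy_orbit(f, x, length, d, rng):
    """A d-pseudo-orbit in [0, 1]: each step is perturbed by less than d."""
    points = [x]
    for _ in range(length - 1):
        step = f(points[-1]) + rng.uniform(-0.9 * d, 0.9 * d)
        # Clamping to [0, 1] never moves the point farther from f(previous),
        # because f maps [0, 1] into itself.
        points.append(min(1.0, max(0.0, step)))
    return points


def is_pseudo_orbit(f, points, d):
    """True if the distance from f(x_k) to x_(k+1) is below d for every k."""
    return all(abs(f(a) - b) < d for a, b in zip(points, points[1:]))


def shadows(f, x, points, e):
    """True if the orbit of x stays within e of the given points, term by term."""
    return all(abs(y - p) < e for y, p in zip(orbit(f, x, len(points)), points))


def halve(x):
    return x / 2.0


rng = random.Random(0)
pseudo = noisy_orbit(halve, 0.8, 50, d=0.01, rng=rng)
print(is_pseudo_orbit(halve, pseudo, 0.01))
print(shadows(halve, 0.8, pseudo, 0.02))
```

For this contracting map the error at each step is at most half the previous error plus d, so it never exceeds 2d, and the orbit starting at the same point 2d-shadows the pseudo-orbit. Maps that stretch distances are where the question becomes hard.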
o A Probabilistic Approach to Lights Out. In the mid-1990s an electronic puzzle called "Lights Out" became popular. The puzzle consists of a grid of lights, and at the start, some of these lights are on. Pressing one of the lights toggles it and each of the (up to four) lights adjacent to it. (Diagonal neighbors are not affected.) To win, one must switch off all of the lights. This puzzle can be reformulated in terms of graph theory, with vertices representing lights and edges representing adjacency. One can then study graph-theoretic properties of the puzzle, as well as optimal winning strategies and "always winnable" graphs.
This project will deal with adding a probabilistic element to the puzzle. One question relates to the expected number of moves required to win under a strategy of randomly pressing vertices on a winnable graph. Another avenue of research would involve assigning probabilities to various edges, so that lights are only toggled on or off with a certain probability when any vertex is pressed. The study of these questions may lead to other interesting variations on this theme.
Background: Students should have taken a college course involving linear algebra at the sophomore level or higher. Computer expertise (Maple or Mathematica) and expertise in basic probability would be helpful, but not essential.
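The graph formulation and the random-pressing strategy described above can be simulated directly. The sketch below is one possible setup, not the project's method: the representation of the puzzle state as the set of lit vertices, the helper names, and the cap on the number of moves are all assumptions made for illustration.

```python
import random


def grid_graph(rows, cols):
    """Adjacency sets for a rows x cols grid (no diagonal neighbors)."""
    adj = {}
    for r in range(rows):
        for c in range(cols):
            nbrs = [(r + dr, c + dc)
                    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))]
            adj[(r, c)] = {(i, j) for i, j in nbrs
                           if 0 <= i < rows and 0 <= j < cols}
    return adj


def press(lit, v, adj):
    """Toggle vertex v and its neighbors; `lit` is the set of lights that are on."""
    return lit ^ ({v} | adj[v])


def random_play(adj, lit, rng, max_moves=100_000):
    """Press uniformly random vertices until all lights are off.

    Returns the number of moves used, or None if the cap is reached
    (for example, when the starting position is not winnable).
    """
    vertices = sorted(adj)
    lit = frozenset(lit)
    moves = 0
    while lit and moves < max_moves:
        lit = press(lit, rng.choice(vertices), adj)
        moves += 1
    return moves if not lit else None


adj = grid_graph(3, 3)
start = press(frozenset(), (1, 1), adj)  # position reached by one press
print(random_play(adj, start, random.Random(0)))
```

Averaging the result of random_play over many runs gives a simulation estimate of the expected number of moves from a given starting position, which can then be compared against any exact calculation.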