Is data science mathematically interesting?
$begingroup$
I have seen a plethora of job advertisements in the last few years on mathjobs.org for academic positions in data science. Now I understand why economic pressures would cause this to happen, but from a traditional view of university organization, how does data science fit in?
I would have guessed that at most, a research group having to do with something labeled "data science" could be formed as an interdisciplinary project between applied mathematicians, statisticians, and computer scientists, with corporate funding. But I don't see why it is a fundamentally distinct intellectual endeavor, prompting mathematics hires specifically in data science.
The first time I heard the term “data science,” it was said that they wanted to take PhD’s who had experience in statistically analyzing large data sets, and train them for a few weeks to apply these skills to marketing and advertising. Now just a few years later people want to hire professors of this.
Question: What about data science is particularly interesting from a mathematical point of view?
st.statistics computer-science gm.general-mathematics
$endgroup$
7
$begingroup$
Why "at most"? You could also say, and historically people did say: "why have an academic position in computer science rather than organize interdisciplinary collaboration between math and electrical engineering?" The idea of a distinct intellectual endeavor can follow the creation of a department.
$endgroup$
– Matt F.
Oct 2 at 17:09
6
$begingroup$
@MattF. Well this is why I’m asking the question! There is an answer to give for computer science. How about data science?
$endgroup$
– Monroe Eskew
Oct 2 at 17:18
9
$begingroup$
Even if some mathematicians do not find data science interesting... there are still people (including me) who do data science, and find it interesting, and use some math, and have things in the area to teach students, which the students find valuable, and which are as testable and certifiable as anything else in academia. That and the economic incentives together justify departments of data science to me.
$endgroup$
– Matt F.
Oct 2 at 17:46
1
$begingroup$
There are some good answers here, so just a comment. A key part of data science is the use of neural nets. Given a problem, we need to define a net with a suitable architecture, train it, and voilà, we are done. Only problem: what is "suitable"? Currently this is done more by trial and error than on the basis of any proper theory.
$endgroup$
– Keith
Oct 4 at 6:27
2
$begingroup$
Much as we would like to think that different academic departments represent distinct intellectual endeavors (perhaps in some Platonic heaven where true knowledge is neatly separated into disjoint buckets?), I think the reality is that pragmatic considerations dominate. In practice it is hard to get tenure and promotion doing only interdisciplinary work, so it tends to be neglected in favor of activities that earn more respect. If you want some area to get a lot of attention, then the best strategy is usually to create a new department, regardless of what things look like in Platonic heaven.
$endgroup$
– Timothy Chow
Oct 4 at 15:58
edited Oct 2 at 17:57
community wiki
Monroe Eskew
6 Answers
$begingroup$
I will stay away from the academic politics of hiring "professors of data science", but if I interpret the question more specifically as "does data science offer problems of mathematical interest", I might refer to Bandeira's list of 42 Open Problems in Mathematics of Data Science.
(The full list from 2016 is here, and Bandeira's home page links to solutions of some of these.)
$endgroup$
1
$begingroup$
Ha, this list includes some of my favorite open problems (the Komlós conjecture, the matrix "six standard deviations suffice" theorem, constructive Kadison–Singer, determining the constant in Grothendieck's inequality), and I am a little puzzled that any of them has much to do with data science, but hey... maybe that's how you get enough people to work on something so that you can see it solved.
$endgroup$
– Sasho Nikolov
Oct 4 at 18:12
$begingroup$
Fundamentally, a lot of what a modern data scientist does is very similar to what in previous generations would have been the responsibility of a statistician, and it shouldn't surprise you that there are professors of statistics. Mathematically, quite a few interesting things come up in a lot of modern data science, but first let me make a non-comprehensive taxonomy of the sub-areas, because "data science" includes several different activities:
Data collection: this is a largely non-mathematical task in which the data is actually gathered. Novel mathematical problems do get solved in this area when one is doing inference, because the structure of the collection significantly affects the independence and sampling assumptions of many methods; that mathematics is usually done in the context of social science or more applied statistics. For example, "Causal Inference without Balance Checking" is a paper about the mathematics of dealing with non-random data collection in inference, written by two economists and a political scientist. The majority of this kind of work is not mathematical, and is much more in the realm of computer scientists and social scientists.
Extraction, transformation, and loading (ETL): this is largely the domain of computer scientists; especially once you get into issues of "big data", you are often talking about running massively parallel algorithms on distributed systems. Some mathematics goes into this, even though it is largely not mathematical. For example, in natural language processing a key part of this step might be to process words according to a topic model, the most common of which was described in this paper. The underlying model is deeply mathematical, being a Bayesian generative model, and the paper shows how this work, although done outside of a math department, is mathematical research.
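To make the topic-model remark concrete, here is a toy sampler for the generative process that LDA-style models describe (my own minimal sketch, not the cited paper's method; the topic and document counts are made up, and the hard inverse problem of Bayesian inference is not shown):

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy LDA-style generative process: each topic is a distribution over a
# small vocabulary; each document mixes topics and draws words from them.
n_topics, vocab_size, n_docs, doc_len = 3, 20, 100, 50

# Sparse topics: each concentrates its mass on a few words.
topics = rng.dirichlet(np.full(vocab_size, 0.1), size=n_topics)

docs = []
for _ in range(n_docs):
    theta = rng.dirichlet(np.full(n_topics, 0.5))    # per-document topic mix
    z = rng.choice(n_topics, size=doc_len, p=theta)  # topic of each word
    words = np.array([rng.choice(vocab_size, p=topics[k]) for k in z])
    docs.append(words)

print(len(docs), docs[0][:5])
```

Inference then runs this process backwards: given only `docs`, recover `topics` and the per-document mixtures, which is where the Bayesian mathematics lives.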
Inference: this is the domain of classical statisticians, and is all about creating models, and estimators from those models, to learn something about the population you are sampling from. In the modern practice of data science there are plenty of people who are interested in inference, myself included, and who use the classical tools of statistics to get at it. Interestingly, there is an abundance of subjects where the classical tools of inference have been reappropriated to new contexts for prediction. Most interesting for inference is that there is a lot of new mathematics to be done in taking the new models we are using for prediction and making them usable for inference. For example, "Consistency of Random Forests" takes a workhorse of data science and tries to understand its mathematical properties, moving towards a place where the otherwise predictive model can be used for inference. Moreover, there is a lot of mathematical work on models utilized by data scientists, asking when and how they can be used for an inferential task. The classic example is graphical models, where Judea Pearl's book delves precisely into this question.
Prediction: this is what the majority of industry data scientists spend the bulk of their time working on. Prediction is often approached entirely empirically, meaning that very little mathematics goes into it; instead it is based largely on simulation or testing on real data. However, there is math to be done here, both in setting foundations and in the fact that prediction can easily be re-framed as approximation, a classic topic in analysis. In fact, there is a fundamental theorem in machine learning called the Universal Approximation Theorem, which in essence proves a fact about the density of the convex hull of a subspace of $L_2$.
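The "prediction as approximation" point can be seen in a few lines (a minimal sketch of my own, not from any cited paper): a one-hidden-layer network with random hidden weights, whose output layer alone is fitted by least squares, already approximates a smooth target well, which is the phenomenon the Universal Approximation Theorem formalizes.

```python
import numpy as np

rng = np.random.default_rng(0)

# Target function and training grid.
x = np.linspace(0, 2 * np.pi, 200).reshape(-1, 1)
y = np.sin(x).ravel()

# One hidden layer with *random* weights; only the output layer is fitted
# (the "random features" shortcut), via ordinary least squares.
n_hidden = 50
W = rng.normal(size=(1, n_hidden))
b = rng.normal(size=n_hidden)
H = np.tanh(x @ W + b)                       # hidden activations (200, 50)

c, *_ = np.linalg.lstsq(H, y, rcond=None)    # minimize ||H c - y||^2
y_hat = H @ c

rmse = np.sqrt(np.mean((y_hat - y) ** 2))
print(rmse)
```

Even with no training of the hidden layer, the fit error is small; full network training only improves on this.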
With that groundwork for what data science is out of the way, here are some more specific mathematical issues at play:
Non-convex optimization: one of the most common tasks in machine learning is to optimize some non-convex function. One of the things data scientists wish to understand is the properties of these non-convex optimizations, especially because they are frequently used but still only partially understood mathematically. "Non-convex Optimization for Machine Learning" is a monograph that tackles this exact problem, and is very approachable even for the non-mathematician.
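A minimal illustration of the difficulty (my toy example, not from the monograph): plain gradient descent on a quartic with two local minima converges to different answers depending only on initialization, something that cannot happen for a convex objective.

```python
def f(x):
    """Quartic with two local minima (global minimum near x = -1.3)."""
    return x**4 - 3 * x**2 + x

def grad(x):
    """Derivative of f."""
    return 4 * x**3 - 6 * x + 1

def gradient_descent(x0, lr=0.01, steps=2000):
    x = x0
    for _ in range(steps):
        x -= lr * grad(x)
    return x

# Starting on opposite sides of the central hump lands in different minima.
left = gradient_descent(-2.0)
right = gradient_descent(2.0)
print(left, right, f(left), f(right))
```

Here the run started at -2 finds the global minimum while the run started at 2 gets stuck in the shallower basin; in millions of dimensions this is the landscape neural network training navigates.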
Foundations: I know that when mathematicians think of foundations we often think of the esoteric, but in this context what I mean is that, because data science has developed so quickly as an applied discipline, it often discovers that certain models and techniques 'work' while there is quite a bit of mystery about why. For a good introduction to this kind of thinking you can look at a talk like "On the Connection between Neural Networks and Kernels" or a book like Foundations of Data Science by Blum, Hopcroft, and Kannan, which is an undergraduate textbook (so not too advanced), though with more training you can easily see some of the deeper issues. Much of data science is deeply rooted in functional analysis, and so I expect to see a lot of work coming from that direction in the future.
Generative modeling: this is the problem of approximating a distribution. There is of course more traditional work in analysis about interpolation and function approximation in given function spaces, and there is also work in probability theory on precisely this problem. In addition to those two traditions, generative modeling also draws heavily on non-parametric estimation. For example, the book "A Distribution-Free Theory of Nonparametric Regression" is an interesting mathematical take on many methods used classically in non-parametric statistics and by data scientists in generative modeling.
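As a small illustration of the non-parametric flavor (a textbook Gaussian kernel density estimator, sketched by me rather than taken from the book): the density is approximated directly from samples, with no parametric family assumed.

```python
import numpy as np

rng = np.random.default_rng(1)
sample = rng.normal(loc=0.0, scale=1.0, size=1000)  # the "unknown" distribution

def kde(x, data, bandwidth=0.3):
    """Gaussian kernel density estimate of `data`, evaluated at points x."""
    u = (x[:, None] - data[None, :]) / bandwidth
    k = np.exp(-0.5 * u**2) / np.sqrt(2 * np.pi)
    return k.mean(axis=1) / bandwidth

grid = np.linspace(-4, 4, 81)
density = kde(grid, sample)

# Sanity checks: total mass near 1, mode near the true mean 0.
area = density.sum() * (grid[1] - grid[0])
peak = grid[np.argmax(density)]
print(area, peak)
```

The distribution-free theory in the book concerns exactly how estimates like this converge as the sample grows, and how the bandwidth should shrink with sample size.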
This is just a sampling of topics; for example, I didn't even touch on reinforcement learning. I think that as time goes on, the language and literature surrounding data science will grow into a robust body of work rooted firmly in analysis and probability theory (with smatterings of geometry and topology).
$endgroup$
$begingroup$
Also! If anyone works on this sort of thing I'd love if you let me know about your work :)
$endgroup$
– Juan Sebastian Lozano
Oct 3 at 22:37
$begingroup$
Hi, thank you for this excellent summary, from which I learned quite a bit. I have a similar vision regarding a better language surrounding data science. I am interested in connections between information geometry, nonlinear filtering, and machine learning. Feel free to contact me if you are interested to learn more.
$endgroup$
– S.Surace
Oct 3 at 23:26
$begingroup$
The Mathematics of Data may go some way towards answering your question. As one example of a mathematically interesting topic that is motivated by data science, you might want to look at the concept of persistent homology.
$endgroup$
$begingroup$
If I may split hairs, was it really "motivated by data science"? Afra Zomorodian was working on persistent homology years before PR people popularized the term "data science".
$endgroup$
– Rodrigo de Azevedo
Oct 5 at 13:14
2
$begingroup$
@RodrigodeAzevedo : My understanding is that Zomorodian was motivated by computational questions in topology, which I consider to fall under the umbrella of "data science". It may be anachronistic to say that people were working on data science before the term "data science" was in vogue, but that doesn't bother me much. As an analogy, I'm happy to say that people were studying "linear algebra" in the 19th century (and perhaps even earlier) even if they didn't use that term at the time.
$endgroup$
– Timothy Chow
Oct 5 at 15:14
$begingroup$
Thank you for taking the time to provide such a detailed reply. Indeed, a field can exist before it has an official name. My (hair-splitting) argument was that Zomorodian may have viewed his work as topological data analysis, not as data science. Apparently, "analysis" was not authoritative enough. Or perhaps data science is a superset of data analysis.
$endgroup$
– Rodrigo de Azevedo
Oct 5 at 15:24
$begingroup$
By the way, this expository article may complement your answer nicely.
$endgroup$
– Rodrigo de Azevedo
Oct 5 at 15:31
$begingroup$
To begin, there is a family of results which are sometimes referred to as "no free lunch" theorems. Each of these results, in its own way, asserts that any optimization algorithm is just as good as any other if you average over the space of all optimization problems. On the other hand, we know that in specific domains some algorithms vastly outperform all others (that we're aware of): for detecting objects in images, convolutional neural networks are state of the art, and in computational linguistics the best you can do for most tasks is a neural network with an LSTM or transformer architecture. In both of these cases, the state of the art algorithms perform vastly better than, say, logistic regression.
How can we reconcile the "No free lunch" theorems with our empirical experience? The answer has to be that object detection in images and standard NLP tasks aren't "typical" optimization problems - some combination of the data and the task has some special structure which particular neural architectures are unusually good at detecting. What is this structure? Why are known algorithms so good at learning it? Can we generate new algorithms (neural or otherwise) that are even better?
These are all essentially math problems, sitting somewhere at the intersection of optimization theory and information theory. They are pretty wide open: except in simple cases like logistic regression, there isn't much in the way of theory that characterizes an algorithm as optimal for a particular task among the space of all possible optimization algorithms. An influential paper from 2014 proposes to use the theory of renormalization groups from physics to tackle this question, and there are other attempts using gauge theory or the principle of maximum entropy. Another line of attack involves the so-called "manifold hypothesis", which asserts that real-world data sets (presented as sets of points in Euclidean space) tend to cluster near a high-codimension submanifold, i.e. one of much lower dimension than the ambient space.
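A toy version of the manifold hypothesis (my illustration, with made-up dimensions): points sampled from a circle, a 1-dimensional manifold, linearly embedded in $\mathbb{R}^{50}$ with a little noise. The singular values of the data collapse after rank 2, so the cloud is essentially 2-dimensional despite its 50-dimensional ambient space; nonlinear methods can go further and recover the intrinsic dimension 1.

```python
import numpy as np

rng = np.random.default_rng(2)

# Sample a circle (intrinsic dimension 1), embed it linearly in R^50,
# and add small ambient noise.
n, ambient = 500, 50
theta = rng.uniform(0, 2 * np.pi, size=n)
circle = np.column_stack([np.cos(theta), np.sin(theta)])      # (500, 2)
embed = rng.normal(size=(2, ambient))
data = circle @ embed + 0.01 * rng.normal(size=(n, ambient))

# Singular values of the centered cloud: two large ones, then a cliff.
centered = data - data.mean(axis=0)
s = np.linalg.svd(centered, compute_uv=False)
print(s[:4])
```

Real data sets are of course never this clean, but the same spectral cliff is what dimension-reduction methods like PCA exploit.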
That's my answer to your main technical question, but I'll also make a remark about the academic politics. There are significantly more job openings in data science than there are people to fill them, so much so that many companies (like Airbnb) have found it cheaper and easier to start internal data science training programs than to hire outside people. This problem is not likely to go away any time soon, so it's sensible to incentivize universities to start degree programs in the field, even if it's not yet a fully fleshed-out academic discipline. This has plenty of historical precedent; for instance, academic programs in forensic science and financial mathematics sprouted in the same way in the 1990s.
$endgroup$
$begingroup$
This is such an interesting framing!
$endgroup$
– Juan Sebastian Lozano
Oct 3 at 23:11
1
$begingroup$
Very interesting answer, but the first paragraph seems to blur the distinction between an optimization algorithm and a classification algorithm. A convolutional neural network is not an optimization algorithm, for example.
$endgroup$
– littleO
Oct 4 at 3:51
1
$begingroup$
@littleO Sure it is! Rather, training a classification algorithm is. Deep neural networks, including CNNs, have a large number of parameters which specify how data moves between the neurons, and the process of training a neural network corresponds to finding, usually via some form of gradient descent, a collection of parameters which minimizes an objective function. In the case of classification problems the objective function is chosen to punish classification errors on training data; cross-entropy is a typical choice. Most other classification algorithms can be viewed similarly.
$endgroup$
– Paul Siegel
Oct 4 at 5:34
$begingroup$
I know that one trains a neural network (for example) by using an optimization algorithm such as stochastic gradient descent; I was just making the point that there's a distinction between a classifier and the optimization algorithm which is used to train the classifier.
$endgroup$
– littleO
Oct 4 at 5:58
1
$begingroup$
@littleO Fair enough, I guess. For the purposes of this discussion "optimization algorithm" means an algorithm which takes as input a function defined on a finite set of points in some space and produces as output a probability distribution which "best approximates" the input function in an appropriate sense. It was not intended to refer to the actual numerical analysis used to construct the probability distribution. I tried to suppress these details deliberately because the same remarks apply to a broader class of data science problems than just classification.
$endgroup$
– Paul Siegel
Oct 4 at 6:33
$begingroup$
I think the problem is that "data science" means many different things to different people. To you it connotes applying statistics to marketing, but for others it covers large swaths of probability, statistics, machine learning, even things like geometry, etc.
But this can be an opportunity too. If I wade into the politics of hiring just a little, and also interpret your question as "why would data science professors be principled additions to math departments?"... well, if a department can secure a line of funding for "data science", they need not hire a marketing or advertising person; they might choose to fill it with a probabilist, and so on.
$endgroup$
2
$begingroup$
This is too rosy: a hire in data science is unlikely to be or to call themselves a probabilist. The first search page of google.com/… turns up one "amateur probabilist", two references to Leo Breiman who died in 2005, and several inapt uses of the word "probabilist".
$endgroup$
– Matt F.
Oct 3 at 21:05
$begingroup$
@MattF. I agree getting a probabilist in particular for such a position is a stretch.
$endgroup$
– usul
Oct 3 at 22:23
$begingroup$
@MattF While you might be right, I also rarely hear the word "probabilist" and hear "probability theorist" much more.
$endgroup$
– Juan Sebastian Lozano
Oct 3 at 22:26
$begingroup$
Yes, but searching for “probability theorist” + “data science” turns up: 1) Cosma Shalizi, a Carnegie-Mellon statistician, saying his advisor was a probability theorist; 2) Scott Sheffield, an MIT probability theorist, at his courtesy appointment in the data science program that he does not bother to mention on his home page; 3) Robert Wolpert, a data scientist whose career was set by considering but not taking a class from a noted probability theorist; and 4+) lists with disjoint sublists of data scientists and probability theorists. So that connection too is mostly historical.
$endgroup$
– Matt F.
Oct 4 at 2:32
$begingroup$
I studied data science. My personal experience is that about 80% of it was not really interesting, so it depends a lot on which sub-modules you study within the field. That may just be my personal taste in learning; it may be different for others. But I would like to mention the following:
What is data? This is genuinely difficult to answer, and different people define it differently, but it boils down to information: data = information. Then we can ask what information is. Without going into too many details, a computer works with bits of information; a bit has one of two states, zero or one, as you probably already knew. Another view is that information (or data) is a collection of objects (a collection of bits), and these individual units can interact in simple or very complex ways. But data can be much more than that, including higher layers built on top of it; this could be a research subject of its own.
What is science? Science is everywhere, and it has a history: scientific things have been invented and researched in the past and are being researched today. I will not say much more, because almost everyone has a notion of what science is.
So when we put data and science together, we have something that is genuinely hard to describe in a concrete way. One suggestion is to look at the data science curricula of different universities and schools; what you will find there is most likely a wide range of subjects. It is a broad field, so you should focus on narrower parts of it and find the subjects you personally like or are good at. The subjects taught also vary between universities: at mine, for example, we learned Java instead of C/C++, so even the programming languages you learn can differ. Of course a programming language is not really about data or information in itself, but it is a tool one can use to write programs that process data.
--- Edit: you can then do scientific research on the data you have using that programming language. ---
Programming languages have similarities and differences. I think it is the similarities that are important to learn, but some teachers do not focus on that aspect and only teach what one "specific" language can do.
Personally, I believe that learning pure mathematics and/or Boolean algebra, plus perhaps a bit of electronics and similar material, is quite fruitful, and I think some areas need people who know a little about everything and a lot about a few concrete subject areas.
To conclude, data science is interesting once you are focused and deep into some particular problem or subject. In school I did not find it very interesting, because it was divided into modules I did not care much about: very difficult concepts that are only used in big systems, perhaps networked systems, and I had little interest in moving big data around. I was more interested in what the data is made of, and in the smaller units that can be processed into a whole. There is much more to say; I have only scratched the surface of my own personal experience and knowledge of this issue, and I have tried not to be too formal or use too many complex words.
$endgroup$
6 Answers
$begingroup$
I will stay away from the academic politics of hiring "professors of data science", but if I interpret the question more specifically as "does data science offer problems of mathematical interest", I might refer to Bandeira's list of 42 Open Problems in Mathematics of Data Science.
(The full list from 2016 is here, and Bandeira's home page links to solutions of some of these.)
$endgroup$
answered Oct 2 at 17:43
community wiki
Carlo Beenakker
$begingroup$
Ha, this list includes some of my favorite open problems (the Komlos conjecture, matrix "six standard deviations suffice" theorem, constructive Kadison-Singer, determining the constant in Grothendieck's inequality), and I am a little puzzled that any of them has much to do with data science but hey..maybe that's how you get enough people to work on something so that you can see it solved.
$endgroup$
– Sasho Nikolov
Oct 4 at 18:12
$begingroup$
Fundamentally, a lot of what a modern data scientist does is very similar to what in previous generations would have been the responsibility of a statistician, and it shouldn't surprise you that there are professors of statistics. Mathematically, quite a few interesting things come up in modern data science, but first let me give a non-comprehensive taxonomy of its sub-areas, because "data science" covers several distinct activities:
Data Collection: this is largely a non-mathematical task in which the data is actually gathered. Novel mathematical problems do arise here when one is doing inference, because the structure of the collection significantly affects the independence and sampling assumptions of many methods; that mathematics is usually done in the context of social science or applied statistics. For example, "Causal Inference without Balance Checking" is a paper about the mathematics of dealing with non-random data collection in inference, written by two economists and a political scientist. The majority of this kind of work is not mathematical, however, and sits much more in the realm of computer scientists and social scientists.
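To make the collection-structure point concrete, here is a toy sketch of the matching idea behind that line of work: coarsen a covariate into bins, compare treated and control units only within bins, and drop bins with no match. Everything below is synthetic, and the single-covariate binning is an illustrative stand-in for the coarsening described in that literature, not the paper's actual procedure.

```python
import random
from collections import defaultdict

random.seed(0)

# Synthetic observational data: treatment is more likely for older units,
# and the outcome depends on both age and treatment (true effect = 2.0).
units = []
for _ in range(2000):
    age = random.uniform(20, 60)
    treated = random.random() < (age - 20) / 40      # non-random assignment
    outcome = 0.1 * age + (2.0 if treated else 0.0) + random.gauss(0, 1)
    units.append((age, treated, outcome))

def naive_diff(units):
    """Difference in mean outcomes, ignoring the covariate imbalance."""
    t = [y for _, tr, y in units if tr]
    c = [y for _, tr, y in units if not tr]
    return sum(t) / len(t) - sum(c) / len(c)

def cem_estimate(units, bin_width=5.0):
    """Coarsen age into bins, compare treated vs control within each bin,
    and average the within-bin differences weighted by treated counts."""
    bins = defaultdict(lambda: {"t": [], "c": []})
    for age, tr, y in units:
        bins[int(age // bin_width)]["t" if tr else "c"].append(y)
    num, den = 0.0, 0
    for b in bins.values():
        if b["t"] and b["c"]:                 # drop bins without a match
            diff = sum(b["t"]) / len(b["t"]) - sum(b["c"]) / len(b["c"])
            num += diff * len(b["t"])
            den += len(b["t"])
    return num / den

print(naive_diff(units))    # inflated by the age confounder
print(cem_estimate(units))  # much closer to the true effect of 2.0
```

The naive comparison picks up the confounding (older units are both more often treated and have higher outcomes), while the within-bin comparison largely removes it.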
Extraction, Transformation, and Loading (ETL): this is largely the domain of computer scientists; especially once you get into "big data" issues, you are often running massively parallel algorithms on distributed systems. Some mathematics goes into this, even though the work itself is largely not mathematical. For example, in natural language processing a key part of this step might be to process words according to a topic model, the most common of which was described in this paper. The underlying model, a Bayesian generative model, is deeply mathematical, and the paper shows how this work (although done outside a math department) is mathematical research.
Inference: this is the domain of classical statisticians, and is all about building models, and estimators from those models, to learn something about the population you are sampling from. In the modern practice of data science there are plenty of people interested in inference, myself included, who use the classical tools of statistics to get at it. Interestingly, there is an abundance of subjects in which the classical tools of inference have been repurposed for prediction in new contexts. Most interesting for inference is that there is a lot of new mathematics to be done in taking the models we now use for prediction and making them usable for inference. For example, "Consistency of Random Forests" takes a workhorse of data science, tries to understand its mathematical properties, and moves toward a place where this otherwise predictive model can be used for inference. There is also a lot of mathematical work on models used by data scientists asking when and how they can serve an inferential task; the classic example is graphical models, where Judea Pearl's book delves precisely into this question.
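As a tiny illustration of the classical inferential machinery this paragraph alludes to, here is a percentile bootstrap confidence interval for a mean in plain Python. The sample, the confidence level, and the replication count are illustrative choices, not anything from the cited papers.

```python
import random
import statistics

random.seed(42)

# A made-up sample from the "population" we want to learn about.
sample = [random.gauss(10.0, 3.0) for _ in range(200)]

def bootstrap_ci(data, stat=statistics.mean, reps=2000, alpha=0.05):
    """Percentile bootstrap: resample with replacement, recompute the
    statistic, and read off the empirical alpha/2 and 1-alpha/2 quantiles."""
    n = len(data)
    stats = sorted(
        stat([random.choice(data) for _ in range(n)]) for _ in range(reps)
    )
    lo = stats[int((alpha / 2) * reps)]
    hi = stats[int((1 - alpha / 2) * reps) - 1]
    return lo, hi

lo, hi = bootstrap_ci(sample)
print(f"mean = {statistics.mean(sample):.2f}, 95% CI = ({lo:.2f}, {hi:.2f})")
```

The same resampling trick works for statistics with no closed-form sampling distribution, which is exactly the situation modern predictive models put us in.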
Prediction: this is what the majority of industry data scientists spend the bulk of their time on. Prediction is often approached entirely empirically, meaning that very little mathematics goes into it; instead it rests largely on simulation or testing on real data. However, there is mathematics to be done here, both in setting foundations and in the fact that prediction can easily be re-framed as approximation, a classic topic in analysis. In fact, a fundamental theorem of machine learning, the Universal Approximation Theorem, is in essence a density statement about the span of a simple family of functions in $L_2$.
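The approximation viewpoint can be made concrete in its easiest special case: any piecewise-linear interpolant is exactly a one-hidden-layer ReLU network whose coefficients are the slope changes at the knots, so adding knots drives the approximation error to zero. A self-contained sketch (the target function and knot spacing are arbitrary choices):

```python
import math

def relu(x):
    return max(0.0, x)

def relu_network(f, knots):
    """Build g(x) = f(t0) + sum_k c_k * relu(x - t_k), the one-hidden-layer
    ReLU network that interpolates f at the knots: each c_k is the change
    in slope of the piecewise-linear interpolant at knot t_k."""
    slopes = [
        (f(knots[i + 1]) - f(knots[i])) / (knots[i + 1] - knots[i])
        for i in range(len(knots) - 1)
    ]
    coeffs = [slopes[0]] + [slopes[i] - slopes[i - 1] for i in range(1, len(slopes))]
    bias = f(knots[0])
    return lambda x: bias + sum(c * relu(x - t) for c, t in zip(coeffs, knots))

# Approximate sin on [-3, 3] with 61 hidden units (knot spacing 0.1).
knots = [-3 + 0.1 * k for k in range(61)]
g = relu_network(math.sin, knots)

# Maximum error on a fine grid; it shrinks like (spacing)^2 as knots are added.
err = max(abs(g(x) - math.sin(x)) for x in [-3 + 0.01 * k for k in range(601)])
print(f"max error with {len(knots)} units: {err:.5f}")
```

This is only the one-dimensional warm-up; the actual theorem is about density in function spaces over higher-dimensional domains, but the flavor, approximating arbitrary functions by sums of simple ridge functions, is the same.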
With that groundwork for what data science is out of the way, here are some more specific mathematical issues at play:
Non-Convex Optimization: one of the most common tasks in machine learning is to optimize a non-convex function. Data scientists want to understand the properties of these non-convex optimization problems, especially because they are used constantly yet are still being understood mathematically. "Non-convex Optimization for Machine Learning" is a monograph that tackles exactly this problem, and it is approachable even for the non-mathematician.
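A minimal illustration of why non-convexity matters: plain gradient descent on a double-well function converges to whichever local minimum the initialization falls near, and random restarts are the crudest remedy. The function, step size, and restart count below are arbitrary choices for illustration.

```python
import random

random.seed(1)

def f(x):
    """Double-well function: two local minima, the global one near x = -1.30."""
    return x**4 - 3 * x**2 + x

def grad(x):
    return 4 * x**3 - 6 * x + 1

def gradient_descent(x, lr=0.01, steps=500):
    for _ in range(steps):
        x -= lr * grad(x)
    return x

# From a start in the right-hand basin, gradient descent gets stuck
# in the shallow local minimum near x = 1.13.
x_bad = gradient_descent(1.0)

# Random restarts: run from many initializations, keep the best.
starts = [random.uniform(-2, 2) for _ in range(20)]
x_best = min((gradient_descent(x0) for x0 in starts), key=f)

print(f"single run from 1.0: x = {x_bad:.3f}, f = {f(x_bad):.3f}")
print(f"best of 20 restarts: x = {x_best:.3f}, f = {f(x_best):.3f}")
```

For a convex objective every run would land in the same place; the gap between the two printed values is the whole difficulty the monograph is about.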
Foundations: I know that when mathematicians think of foundations we often think of the esoteric, but here what I mean is that, because data science has developed so quickly as an applied discipline, it often discovers that certain models and techniques 'work' while there remains quite a bit of mystery about why. For a good introduction to this kind of thinking, look at a talk like "On the Connection between Neural Networks and Kernels" or a book like Foundations of Data Science by Blum, Hopcroft, and Kannan, an undergraduate textbook (so not too advanced), though with more training you can easily see some of the deeper issues. Much of data science is deeply rooted in functional analysis, so I expect to see a lot of work coming from that direction in the future.
Generative Modeling: this is the problem of approximating a distribution. There is of course more traditional work in analysis on interpolation and function approximation in given function spaces, and work in probability theory on precisely this problem. In addition to those two traditions, generative modeling also deals a lot with non-parametric estimation. For example, the book "A Distribution-Free Theory of Nonparametric Regression" is an interesting mathematical take on many methods used classically in non-parametric statistics and by data scientists in generative modeling.
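A toy version of the generative-modeling task: estimate a density non-parametrically with a Gaussian kernel, then draw fresh samples from the estimate. The data, the bandwidth, and the two-component mixture are illustrative assumptions, not from any of the cited books.

```python
import math
import random

random.seed(7)

# Made-up data from a two-component mixture we pretend not to know.
data = [random.gauss(-2, 0.5) for _ in range(300)] + \
       [random.gauss(3, 1.0) for _ in range(300)]

def kde(x, data, h=0.4):
    """Gaussian kernel density estimate at x with bandwidth h."""
    n = len(data)
    return sum(
        math.exp(-0.5 * ((x - xi) / h) ** 2) / (h * math.sqrt(2 * math.pi))
        for xi in data
    ) / n

def sample_kde(data, h=0.4):
    """Sampling from a KDE: pick a data point, add kernel noise."""
    return random.choice(data) + random.gauss(0, h)

# The estimate recovers the two modes and the gap between them...
print(kde(-2, data), kde(0.5, data), kde(3, data))

# ...and fresh draws resemble the original data.
new_points = [sample_kde(data) for _ in range(1000)]
```

The bandwidth h is the whole game here: the distribution-free theory alluded to above is largely about how such tuning parameters must scale with the sample size for the estimate to converge.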
This is just a sampling of topics (I didn't even touch on reinforcement learning), and I think that as time goes on the language and literature surrounding data science will grow into a robust body of work rooted firmly in analysis and probability theory (with smatterings of geometry and topology).
$endgroup$
$begingroup$
Also! If anyone works on this sort of thing I'd love if you let me know about your work :)
$endgroup$
– Juan Sebastian Lozano
Oct 3 at 22:37
$begingroup$
Hi, thank you for this excellent summary, from which I learned quite a bit. I have a similar vision regarding a better language surrounding data science. I am interested in connections between information geometry, nonlinear filtering, and machine learning. Feel free to contact me if you are interested to learn more.
$endgroup$
– S.Surace
Oct 3 at 23:26
add a comment
|
$begingroup$
Fundamentally a lot of what a modern data scientist does is very similar to what in previous generations would have been the responsibility of a statistician, and it shouldn't surprise you that there are professors of statistics. Mathematically there are quite a few interesting things that come up in a lot of modern data science, but first let me make a non-comprehensive taxonomy of the sub-areas of data science, because there are several different activities which "data science" includes:
Data Collection: this is largely a non-mathematical task where data is actually collected. There can be novel mathematical problems solved in this area if one is doing inference because the structure of the collection significantly effects the independence and sampling assumptions of a lot of methods, and that mathematics is usually done in the context of social science or more applied statistics. For example, "Causal Inference without Balance Checking" is a paper about the mathematics of dealing with non-random data collection in inference, written by two economists and a political scientist. The majority of this kind of work is not mathematical, and much more in the realm of computer scientists and social scientists.
Extraction, Transformation, and Loading (ETL). This is largely the domain of computer scientists, especially whenever you get into issues of "big-data" you are often times talking about running massively parallel algorithms on distributed systems. There is some mathematics that goes into this, even though it is largely not mathematical. For example in the area of Natural Language Processing a key part of this step might be to process words according to a topic model, the most common of which was described in this paper. The underlying model is deeply mathematical, being a baysian generative model, and the paper shows how this work (although done outside of a math department) is mathematical research.
Inference: This is the domain of classical statisticians, and is all about creating models and estimators from those models to learn something about the population you are sampling from. In the modern practice of data science there are plenty of people who are interested in inference, myself included, and who use the classical tools of statistics to get at it. Interestingly there is an abundance of subjects where the classical tools of inference have been reapproriated to new contexts for prediction. Most interestingly for inference is that there is a lot of new mathematics to be done in taking the new models we are using for prediction and making them usable for inference. For example "Consistency of Random Forests" takes a workhorse of data science and tried to understand its mathematical properties and move towards a place where the otherwise predictive model can be used for inference. Moreover, there is a lot of mathematical work on models utilized by data scientists asking when and how they can be used for an inferential task. The classic example is graphical models, where Judea Pearl's book delves precisely into this question.
Prediction: This is what the majority of industry data scientists spend the bulk of their time working on. Prediction is often thought about entirely empirically, meaning that very little mathematics goes into it and instead it is based largely on simulation or testing on real data. However, there is math to be done here, both in setting foundations, and in the fact that prediction can be easily re-framed as approximation, a classic topic in analysis. In fact there is a fundamental theorem in machine learning called the Universal Approximation Theorem which is in essence proving a fact about the density and the convex hull of a subspace of $L_2$.
With that groundwork for what data science is out of the way, here are some more specific mathematical issues at play:
Non Convex Optimization: one of the most common tasks in machine learning is to optimize over some non-convex function. One of the things data scientists wish to understand is the properties of these non-convex optimizations, especially because they are frequently used, but still being understood mathematically. "Non Convex Optimization for Machine Learning" is a monograph that tackles this exact problem, and is very approachable to even the non-mathematician.
Foundations: I know that when mathematicians think of foundations we often think of the esoteric, but actually in this context what I mean is that, because data science has developed so quickly as an applied discipline, it often discovers that certain models and techniques 'work' but there is quite a bit of mystery about why. For a good introduction into this kind of thinking you can look at a talk like "On the Connection between Neural Networks and Kernels" or a book like Foundations of Data Science by Blum, Hopcraft, and Kannon, which is an undergraduate textbook (so not too advanced) but if you have more training you can easily see some of the deeper issues. So much of data science is deeply rooted in functional analysis, and so I expect to see a lot of work coming from that direction in the future.
Generative Modeling: This is the problem of approximating a distribution. Clearly there is more traditional work in analysis about interpolation and functional approximation in given functional spaces, and there is also work in probability theory about precisely this problem. In addition to those two traditions, generative modeling also deals a lot with non-parametric estimation. For example the book "Distribution Free Theory of Non-Parametric Regression" is an interesting mathematical take on a lot of methods used classically in non-parametric statistics and by data science in generative modeling.
This is just a sampling of topics, for example I didn't even touch on reinforcement learning, and I think that as time goes on the language and literature surrounding data science will grow into a robust set of literature rooted firmly in analysis and proability theory (with smatterings of geometry and topology).
$endgroup$
$begingroup$
Also! If anyone works on this sort of thing I'd love if you let me know about your work :)
$endgroup$
– Juan Sebastian Lozano
Oct 3 at 22:37
$begingroup$
Hi, thank you for this excellent summary, from which I learned quite a bit. I have a similar vision regarding a better language surrounding data science. I am interested in connections between information geometry, nonlinear filtering, and machine learning. Feel free to contact me if you are interested to learn more.
$endgroup$
– S.Surace
Oct 3 at 23:26
add a comment
|
$begingroup$
Fundamentally a lot of what a modern data scientist does is very similar to what in previous generations would have been the responsibility of a statistician, and it shouldn't surprise you that there are professors of statistics. Mathematically there are quite a few interesting things that come up in a lot of modern data science, but first let me make a non-comprehensive taxonomy of the sub-areas of data science, because there are several different activities which "data science" includes:
Data Collection: this is largely a non-mathematical task where data is actually collected. There can be novel mathematical problems solved in this area if one is doing inference because the structure of the collection significantly effects the independence and sampling assumptions of a lot of methods, and that mathematics is usually done in the context of social science or more applied statistics. For example, "Causal Inference without Balance Checking" is a paper about the mathematics of dealing with non-random data collection in inference, written by two economists and a political scientist. The majority of this kind of work is not mathematical, and much more in the realm of computer scientists and social scientists.
Extraction, Transformation, and Loading (ETL). This is largely the domain of computer scientists, especially whenever you get into issues of "big-data" you are often times talking about running massively parallel algorithms on distributed systems. There is some mathematics that goes into this, even though it is largely not mathematical. For example in the area of Natural Language Processing a key part of this step might be to process words according to a topic model, the most common of which was described in this paper. The underlying model is deeply mathematical, being a baysian generative model, and the paper shows how this work (although done outside of a math department) is mathematical research.
Inference: This is the domain of classical statisticians, and is all about creating models and estimators from those models to learn something about the population you are sampling from. In the modern practice of data science there are plenty of people who are interested in inference, myself included, and who use the classical tools of statistics to get at it. Interestingly there is an abundance of subjects where the classical tools of inference have been reapproriated to new contexts for prediction. Most interestingly for inference is that there is a lot of new mathematics to be done in taking the new models we are using for prediction and making them usable for inference. For example "Consistency of Random Forests" takes a workhorse of data science and tried to understand its mathematical properties and move towards a place where the otherwise predictive model can be used for inference. Moreover, there is a lot of mathematical work on models utilized by data scientists asking when and how they can be used for an inferential task. The classic example is graphical models, where Judea Pearl's book delves precisely into this question.
Prediction: This is what the majority of industry data scientists spend the bulk of their time working on. Prediction is often thought about entirely empirically, meaning that very little mathematics goes into it and instead it is based largely on simulation or testing on real data. However, there is math to be done here, both in setting foundations, and in the fact that prediction can be easily re-framed as approximation, a classic topic in analysis. In fact there is a fundamental theorem in machine learning called the Universal Approximation Theorem which is in essence proving a fact about the density and the convex hull of a subspace of $L_2$.
With that groundwork for what data science is out of the way, here are some more specific mathematical issues at play:
Non Convex Optimization: one of the most common tasks in machine learning is to optimize over some non-convex function. One of the things data scientists wish to understand is the properties of these non-convex optimizations, especially because they are frequently used, but still being understood mathematically. "Non Convex Optimization for Machine Learning" is a monograph that tackles this exact problem, and is very approachable to even the non-mathematician.
Foundations: I know that when mathematicians think of foundations we often think of the esoteric, but actually in this context what I mean is that, because data science has developed so quickly as an applied discipline, it often discovers that certain models and techniques 'work' but there is quite a bit of mystery about why. For a good introduction into this kind of thinking you can look at a talk like "On the Connection between Neural Networks and Kernels" or a book like Foundations of Data Science by Blum, Hopcraft, and Kannon, which is an undergraduate textbook (so not too advanced) but if you have more training you can easily see some of the deeper issues. So much of data science is deeply rooted in functional analysis, and so I expect to see a lot of work coming from that direction in the future.
Generative Modeling: This is the problem of approximating a distribution. Clearly there is more traditional work in analysis about interpolation and functional approximation in given functional spaces, and there is also work in probability theory about precisely this problem. In addition to those two traditions, generative modeling also deals a lot with non-parametric estimation. For example the book "Distribution Free Theory of Non-Parametric Regression" is an interesting mathematical take on a lot of methods used classically in non-parametric statistics and by data science in generative modeling.
This is just a sampling of topics, for example I didn't even touch on reinforcement learning, and I think that as time goes on the language and literature surrounding data science will grow into a robust set of literature rooted firmly in analysis and proability theory (with smatterings of geometry and topology).
$endgroup$
Fundamentally a lot of what a modern data scientist does is very similar to what in previous generations would have been the responsibility of a statistician, and it shouldn't surprise you that there are professors of statistics. Mathematically there are quite a few interesting things that come up in a lot of modern data science, but first let me make a non-comprehensive taxonomy of the sub-areas of data science, because there are several different activities which "data science" includes:
Data Collection: this is largely a non-mathematical task where data is actually collected. There can be novel mathematical problems solved in this area if one is doing inference because the structure of the collection significantly effects the independence and sampling assumptions of a lot of methods, and that mathematics is usually done in the context of social science or more applied statistics. For example, "Causal Inference without Balance Checking" is a paper about the mathematics of dealing with non-random data collection in inference, written by two economists and a political scientist. The majority of this kind of work is not mathematical, and much more in the realm of computer scientists and social scientists.
Extraction, Transformation, and Loading (ETL). This is largely the domain of computer scientists, especially whenever you get into issues of "big-data" you are often times talking about running massively parallel algorithms on distributed systems. There is some mathematics that goes into this, even though it is largely not mathematical. For example in the area of Natural Language Processing a key part of this step might be to process words according to a topic model, the most common of which was described in this paper. The underlying model is deeply mathematical, being a baysian generative model, and the paper shows how this work (although done outside of a math department) is mathematical research.
Inference: This is the domain of classical statisticians, and is all about creating models and estimators from those models to learn something about the population you are sampling from. In the modern practice of data science there are plenty of people who are interested in inference, myself included, and who use the classical tools of statistics to get at it. Interestingly there is an abundance of subjects where the classical tools of inference have been reapproriated to new contexts for prediction. Most interestingly for inference is that there is a lot of new mathematics to be done in taking the new models we are using for prediction and making them usable for inference. For example "Consistency of Random Forests" takes a workhorse of data science and tried to understand its mathematical properties and move towards a place where the otherwise predictive model can be used for inference. Moreover, there is a lot of mathematical work on models utilized by data scientists asking when and how they can be used for an inferential task. The classic example is graphical models, where Judea Pearl's book delves precisely into this question.
Prediction: This is what the majority of industry data scientists spend the bulk of their time working on. Prediction is often thought about entirely empirically, meaning that very little mathematics goes into it and instead it is based largely on simulation or testing on real data. However, there is math to be done here, both in setting foundations, and in the fact that prediction can be easily re-framed as approximation, a classic topic in analysis. In fact there is a fundamental theorem in machine learning called the Universal Approximation Theorem which is in essence proving a fact about the density and the convex hull of a subspace of $L_2$.
With that groundwork for what data science is out of the way, here are some more specific mathematical issues at play:
Non Convex Optimization: one of the most common tasks in machine learning is to optimize over some non-convex function. One of the things data scientists wish to understand is the properties of these non-convex optimizations, especially because they are frequently used, but still being understood mathematically. "Non Convex Optimization for Machine Learning" is a monograph that tackles this exact problem, and is very approachable to even the non-mathematician.
Foundations: I know that when mathematicians hear "foundations" we often think of the esoteric, but in this context what I mean is that, because data science has developed so quickly as an applied discipline, it often discovers that certain models and techniques 'work' while quite a bit of mystery remains about why. For a good introduction to this kind of thinking you can look at a talk like "On the Connection between Neural Networks and Kernels" or a book like Foundations of Data Science by Blum, Hopcroft, and Kannan, which is an undergraduate textbook (so not too advanced), though with more training you can easily see some of the deeper issues. Much of data science is deeply rooted in functional analysis, and I expect to see a lot of work coming from that direction in the future.
Generative modeling: This is the problem of approximating a distribution. There is of course more traditional work in analysis on interpolation and functional approximation in given function spaces, and work in probability theory on precisely this problem. In addition to those two traditions, generative modeling also draws heavily on non-parametric estimation. For example, the book "A Distribution-Free Theory of Nonparametric Regression" is an interesting mathematical take on many methods used classically in non-parametric statistics and by data scientists in generative modeling.
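A minimal sketch of one workhorse of non-parametric estimation, a Gaussian kernel density estimate (my own illustration): given samples, it approximates the underlying density without assuming a parametric family, and new points can be drawn from it by resampling a data point and adding kernel noise. The bandwidth here (Silverman's rule of thumb) is one common default, not the only choice.

```python
import numpy as np

rng = np.random.default_rng(1)
data = rng.normal(5.0, 2.0, size=500)  # "unknown" distribution, seen via samples

n = len(data)
h = 1.06 * data.std() * n ** (-1 / 5)  # Silverman's rule-of-thumb bandwidth

def kde(x):
    # average of Gaussian bumps centered at each observed sample
    z = (x - data[:, None]) / h
    return np.exp(-0.5 * z**2).sum(axis=0) / (n * h * np.sqrt(2 * np.pi))

def sample(size):
    # generate new data: pick a sample at random, perturb by the kernel
    centers = rng.choice(data, size=size)
    return centers + rng.normal(0, h, size=size)

grid = np.linspace(-2, 12, 200)
density = kde(grid)  # estimated density, integrates to ~1 on the grid
```

The same estimate-then-sample pattern is what modern generative models do at vastly larger scale, with the kernel average replaced by a learned model.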
This is just a sampling of topics (I didn't even touch on reinforcement learning), and I think that as time goes on the language and literature surrounding data science will mature into a robust body of work rooted firmly in analysis and probability theory (with smatterings of geometry and topology).
answered Oct 3 at 22:20
community wiki
Juan Sebastian Lozano
$begingroup$
Also! If anyone works on this sort of thing I'd love if you let me know about your work :)
$endgroup$
– Juan Sebastian Lozano
Oct 3 at 22:37
$begingroup$
Hi, thank you for this excellent summary, from which I learned quite a bit. I have a similar vision regarding a better language surrounding data science. I am interested in connections between information geometry, nonlinear filtering, and machine learning. Feel free to contact me if you are interested to learn more.
$endgroup$
– S.Surace
Oct 3 at 23:26
add a comment
|
$begingroup$
The Mathematics of Data may go some way towards answering your question. As one example of a mathematically interesting topic that is motivated by data science, you might want to look at the concept of persistent homology.
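To give a feel for the idea (a toy illustration of mine, not from the linked article): degree-0 persistent homology tracks connected components of a point cloud as a distance scale grows, recording each merge as the "death" of a component. Real tools (Ripser, GUDHI) compute higher-degree homology too; this sketch only tracks components, via union-find on edges sorted by length.

```python
import numpy as np

def h0_persistence(points):
    n = len(points)
    # pairwise distances are the candidate merge scales
    d = np.linalg.norm(points[:, None] - points[None, :], axis=2)
    edges = sorted((d[i, j], i, j) for i in range(n) for j in range(i + 1, n))
    parent = list(range(n))

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path compression
            a = parent[a]
        return a

    deaths = []
    for dist, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            deaths.append(dist)  # one component dies at this scale
    return deaths  # n-1 finite deaths; one component persists forever

# two well-separated clusters: one death value stands far above the rest,
# and that long-lived component is the "persistent" topological feature
rng = np.random.default_rng(0)
cloud = np.vstack([rng.normal(0, 0.1, (20, 2)), rng.normal(5, 0.1, (20, 2))])
deaths = h0_persistence(cloud)
```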
$endgroup$
answered Oct 2 at 20:42
community wiki
Timothy Chow
$begingroup$
If I may split hairs, was it really "motivated by data science"? Afra Zomorodian was working on persistent homology years before PR people popularized the term "data science".
$endgroup$
– Rodrigo de Azevedo
Oct 5 at 13:14
2
$begingroup$
@RodrigodeAzevedo : My understanding is that Zomorodian was motivated by computational questions in topology, which I consider to fall under the umbrella of "data science". It may be anachronistic to say that people were working on data science before the term "data science" was in vogue, but that doesn't bother me much. As an analogy, I'm happy to say that people were studying "linear algebra" in the 19th century (and perhaps even earlier) even if they didn't use that term at the time.
$endgroup$
– Timothy Chow
Oct 5 at 15:14
$begingroup$
Thank you for taking the time to provide such a detailed reply. Indeed, a field can exist before it has an official name. My (hairsplitter) argument was that Zomorodian may have viewed his work as topological data analysis, not as data science. Apparently, "analysis" was not authoritative enough. Or, perhaps, data science is a superset of data analysis.
$endgroup$
– Rodrigo de Azevedo
Oct 5 at 15:24
$begingroup$
By the way, this expository article may complement your answer nicely.
$endgroup$
– Rodrigo de Azevedo
Oct 5 at 15:31
add a comment
|
$begingroup$
To begin, there is a family of results which are sometimes referred to as "No free lunch" theorems. Each of these results, in its own way, asserts that any optimization algorithm is just as good as any other if you average over the space of all optimization problems. On the other hand, we know that in specific domains some algorithms vastly outperform all others (that we're aware of): for detecting objects in images, convolutional neural networks are the state of the art, and in computational linguistics the best you can do for most tasks is a neural network with an LSTM or transformer architecture. In both of these cases, the state-of-the-art algorithms perform vastly better than, say, logistic regression.
How can we reconcile the "No free lunch" theorems with our empirical experience? The answer has to be that object detection in images and standard NLP tasks aren't "typical" optimization problems - some combination of the data and the task has some special structure which particular neural architectures are unusually good at detecting. What is this structure? Why are known algorithms so good at learning it? Can we generate new algorithms (neural or otherwise) that are even better?
These are all essentially math problems, sitting somewhere at the intersection of optimization theory and information theory. They are pretty wide open: except in simple cases like logistic regression, there isn't much in the way of theory that characterizes an algorithm as optimal for a particular task among the space of all possible optimization algorithms. An influential paper from 2014 proposes to use the theory of renormalization groups from physics to tackle this question, and there are other attempts using gauge theory or the principle of maximum entropy. Another line of attack involves the so-called "manifold hypothesis", which asserts that real-world data sets (presented as sets of points in Euclidean space) tend to cluster near a submanifold of high codimension.
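The manifold hypothesis can be sketched in a few lines (a synthetic illustration of mine; real data sets are of course messier): data living in $\mathbb{R}^{20}$ but generated from a single intrinsic parameter plus small ambient noise. A singular value decomposition shows the variance concentrating in a couple of directions, which is how low intrinsic dimension shows up in practice.

```python
import numpy as np

rng = np.random.default_rng(0)
t = rng.uniform(0, 2 * np.pi, 1000)           # one intrinsic parameter
circle = np.stack([np.cos(t), np.sin(t)], 1)  # a 1-d manifold in R^2

# embed the circle in R^20 with a random linear map, add ambient noise
embed = rng.normal(size=(2, 20))
X = circle @ embed + 0.01 * rng.normal(size=(1000, 20))

# spectrum of the centered data: only ~2 directions carry real variance
s = np.linalg.svd(X - X.mean(0), compute_uv=False)
explained = (s**2) / (s**2).sum()
top2 = float(explained[:2].sum())  # fraction of variance in the top plane
```

Linear methods like PCA only detect the plane containing the curve; detecting the curvature itself is where the harder geometry starts.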
That's my answer for your main technical question, but I'll also make a remark about the academic politics. There are significantly more job openings in data science than there are people to fill them, so much so that many companies (like Airbnb) have found it cheaper and easier to start internal data science training programs than to hire outside people. This problem is not likely to go away any time soon, so it's sensible to incentivize universities to start degree programs in the field, even if it's not yet a fully fleshed-out academic discipline. This has plenty of historical precedent: academic programs in forensic science and financial mathematics sprouted in the same way in the 1990s.
$endgroup$
edited Oct 3 at 23:02
community wiki
2 revs
Paul Siegel
$begingroup$
This is such an interesting framing!
$endgroup$
– Juan Sebastian Lozano
Oct 3 at 23:11
1
$begingroup$
Very interesting answer, but the first paragraph seems to blur the distinction between an optimization algorithm and a classification algorithm. A convolutional neural network is not an optimization algorithm, for example.
$endgroup$
– littleO
Oct 4 at 3:51
1
$begingroup$
@littleO Sure it is! Rather, training a classification algorithm is. Deep neural networks, including CNN's, have a large number of parameters which specify how data moves between the neurons, and the process of training a neural network corresponds to finding - usually via some form of gradient descent - a collection of parameters which minimizes an objective function. In the case of classification problems the objective function is chosen to punish classification errors in training data - cross entropy is a typical choice. Most other classification algorithms can be viewed similarly.
$endgroup$
– Paul Siegel
Oct 4 at 5:34
$begingroup$
I know that one trains a neural network (for example) by using an optimization algorithm such as stochastic gradient descent; I was just making the point that there's a distinction between a classifier and the optimization algorithm which is used to train the classifier.
$endgroup$
– littleO
Oct 4 at 5:58
1
$begingroup$
@littleO Fair enough, I guess. For the purposes of this discussion "optimization algorithm" means an algorithm which takes as input a function defined on a finite set of points in some space and produces as output a probability distribution which "best approximates" the input function in an appropriate sense. It was not intended to refer to the actual numerical analysis used to construct the probability distribution. I tried to suppress these details deliberately because the same remarks apply to a broader class of data science problems than just classification.
$endgroup$
– Paul Siegel
Oct 4 at 6:33
|
show 1 more comment
$begingroup$
I know that one trains a neural network (for example) by using an optimization algorithm such as stochastic gradient descent; I was just making the point that there's a distinction between a classifier and the optimization algorithm which is used to train the classifier.
$endgroup$
– littleO
Oct 4 at 5:58
$begingroup$
I know that one trains a neural network (for example) by using an optimization algorithm such as stochastic gradient descent; I was just making the point that there's a distinction between a classifier and the optimization algorithm which is used to train the classifier.
$endgroup$
– littleO
Oct 4 at 5:58
1
1
$begingroup$
@littleO Fair enough, I guess. For the purposes of this discussion "optimization algorithm" means an algorithm which takes as input a function defined on a finite set of points in some space and produces as output a probability distribution which "best approximates" the input function in an appropriate sense. It was not intended to refer to the actual numerical analysis used to construct the probability distribution. I tried to suppress these details deliberately because the same remarks apply to a broader class of data science problems than just classification.
$endgroup$
– Paul Siegel
Oct 4 at 6:33
$begingroup$
@littleO Fair enough, I guess. For the purposes of this discussion "optimization algorithm" means an algorithm which takes as input a function defined on a finite set of points in some space and produces as output a probability distribution which "best approximates" the input function in an appropriate sense. It was not intended to refer to the actual numerical analysis used to construct the probability distribution. I tried to suppress these details deliberately because the same remarks apply to a broader class of data science problems than just classification.
$endgroup$
– Paul Siegel
Oct 4 at 6:33
|
show 1 more comment
$begingroup$
I think the problem is that "data science" means many different things to different people. To you it connotes applying statistics to marketing, but for others it covers large swaths of probability, statistics, machine learning, even things like geometry, etc.
But this can be an opportunity too. If I wade into the politics of hiring just a little and also interpret your question as "why would data science professors be principled additions to math departments?" ... well, if a department can secure a line of funding for "data science", they may not choose to hire a marketing or advertising person, they might choose to fill it with a probabilist or so on.
$endgroup$
2
$begingroup$
This is too rosy: a hire in data science is unlikely to be or to call themselves a probabilist. The first search page of google.com/… turns up one "amateur probabilist", two references to Leo Breiman who died in 2005, and several inapt uses of the word "probabilist".
$endgroup$
– Matt F.
Oct 3 at 21:05
$begingroup$
@MattF. I agree getting a probabilist in particular for such a position is a stretch.
$endgroup$
– usul
Oct 3 at 22:23
$begingroup$
@MattF While you might be right, I also rarely hear the word "probabilist" and hear "probability theorist" much more.
$endgroup$
– Juan Sebastian Lozano
Oct 3 at 22:26
$begingroup$
Yes, but searching for “probability theorist” + “data science” turns up: 1) Cosma Shalizi, a Carnegie-Mellon statistician, saying his advisor was a probability theorist; 2) Scott Sheffield, an MIT probability theorist, at his courtesy appointment in the data science program that he does not bother to mention on his home page; 3) Robert Wolpert, a data scientist whose career was set by considering but not taking a class from a noted probability theorist; and 4+) lists with disjoint sublists of data scientists and probability theorists. So that connection too is mostly historical.
$endgroup$
– Matt F.
Oct 4 at 2:32
answered Oct 3 at 19:38
community wiki
usul
$begingroup$
I studied data science in the past, and my personal experience is that about 80% of it was not really interesting to me. So it really depends on which sub-modules you study within the field; this may just reflect my personal taste in learning, and it might be different for others. But I would like to mention the following:
What is data? This is genuinely difficult to answer, and different people define it differently, but it really boils down to information: data = information. Then we can ask, what is information? Without going into too many details: a computer uses bits of information, and a bit can be in one of two states, zero or one, but you probably already knew that. Another way to put it is that information (or data) is a collection of objects (a collection of bits), and these individual units can behave in complex or simple ways. But data can be much more than that; there can be higher layers of data too. This could be a research subject in its own right.
What is science? Well, science is everywhere. Science has a history: scientific things have been invented and researched, today and in the past. I will not say much more, because almost everyone has a notion of what science is.
So when we put data and science together, we have something that is really difficult to describe concretely. One suggestion is to look at the data science subjects in the curricula of different universities and schools; what you will find there is most likely a variety of subjects, so it is a broad field. You should focus on narrower ones and find the subjects that you personally like or are good at. Different universities teach different subjects; for example, at my university we learned Java instead of C/C++, so which programming languages you learn can vary. Of course this is not really about data or information as such, but a programming language is a tool one can use to write programs that process data.
Edit: you can do scientific research on the data you have using that programming language.
Programming languages have some similarities and some differences. I think it is the similarities that are important to learn, but some teachers do not really focus on that aspect and only teach what their specific language can do.
Personally, I believe that learning pure mathematics and/or Boolean algebra, and perhaps a bit of electronics and similar topics, is quite fruitful, and I think some areas need people who know a little about everything and a lot about a few concrete subject areas.
To conclude, data science is interesting once you are focused on and deep into some particular problem or subject. In school I did not find data science very interesting, because it was divided into modules that I did not care much about: very difficult concepts that are only used in big systems and perhaps networking systems. I had little interest in moving big data around; I was more interested in the elements it is made of and the smaller units that can be processed into a whole. There is much more to say; I have only scratched the surface of my own experience and knowledge here, and I tried not to be too formal or use too many complex words.
$endgroup$
edited Oct 3 at 17:10
community wiki
2 revs
Natural Number Guy
7
$begingroup$
Why "at most"? You could also say, and historically people did say: "why have an academic position in computer science rather than organize interdisciplinary collaboration between math and electrical engineering?" The idea of a distinct intellectual endeavor can follow the creation of a department.
$endgroup$
– Matt F.
Oct 2 at 17:09
6
$begingroup$
@MattF. Well this is why I’m asking the question! There is an answer to give for computer science. How about data science?
$endgroup$
– Monroe Eskew
Oct 2 at 17:18
9
$begingroup$
Even if some mathematicians do not find data science interesting...there are still people (including me) who do data science, and find it interesting, and use some math, and have things in the area to teach students, which the students find valuable, and which are as testable and certifiable as anything else in academia. That and the economic incentives together justify departments of data science to me.
$endgroup$
– Matt F.
Oct 2 at 17:46
1
$begingroup$
There are some good answers here, so just a comment. A key part of data science is use of neural nets. Given a problem, we need to define a net with a suitable architecture, train it, and voila, we are done. Only problem: what is "suitable". Currently this is done more by trial and error than on the basis of any proper theory.
$endgroup$
– Keith
Oct 4 at 6:27
2
$begingroup$
Much as we would like to think that different academic departments represent distinct intellectual endeavors (perhaps in some Platonic heaven where true knowledge is neatly separated into disjoint buckets?), I think the reality is that pragmatic considerations dominate. In practice it is hard to get tenure and promotion doing only interdisciplinary work, so it tends to be neglected in favor of activities that earn more respect. If you want some area to get a lot of attention, then the best strategy is usually to create a new department, regardless of what things look like in Platonic heaven.
$endgroup$
– Timothy Chow
Oct 4 at 15:58