Can a neural network compute $y = x^2$?


In the spirit of the famous TensorFlow Fizz Buzz joke and the XOR problem, I started to wonder whether it is possible to design a neural network that implements the function $y = x^2$.

Given some representation of a number (e.g. as a vector in binary form, so that the number 5 is represented as [1,0,1,0,0,0,0,...]), the neural network should learn to return its square, 25 in this case.
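To make the input and target format concrete, here is a minimal sketch of such an encoding. The helper names `encode` and `decode`, the fixed width of 8 bits, and the least-significant-bit-first order are assumptions for illustration; the question does not specify them.

```python
def encode(n, width):
    """Return the `width` lowest bits of a non-negative integer, least significant bit first."""
    return [(n >> i) & 1 for i in range(width)]


def decode(bits):
    """Inverse of encode: rebuild the integer from its LSB-first bits."""
    return sum(bit << i for i, bit in enumerate(bits))


x = encode(5, 8)           # network input for the number 5
target = encode(5 * 5, 8)  # desired network output, the bits of 25
```

With this layout the network would have `width` inputs and `width` outputs, and the width has to be large enough to hold the square of the largest input.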

If I could implement $y=x^2$, I could probably implement $y=x^3$ and, more generally, any polynomial of $x$. With a Taylor series I could then approximate $y=\sin(x)$, which would solve the Fizz Buzz problem: a neural network that can find the remainder of a division.

Clearly, the linear part of a NN alone cannot perform this task, so if multiplication is possible, it must happen thanks to the activation function.

Can you suggest any ideas or reading on the subject?

edited yesterday

asked yesterday

Boris Burkov

1335

New contributor

machine-learning neural-network


2 Answers

Neural networks are also called universal function approximators, a name based on the universal approximation theorem, which states:

In the mathematical theory of artificial neural networks,
the universal approximation theorem states that a feed-forward network
with a single hidden layer containing a finite number of neurons can
approximate continuous functions on compact subsets of $\mathbb{R}^n$, under mild
assumptions on the activation function.

This means that an ANN with a nonlinear activation function can map the function relating the input to the output. The function $y = x^2$ can easily be approximated using a regression ANN.
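As a rough illustration of what the theorem promises, the sketch below builds a single-hidden-layer ReLU network whose weights are set by hand (not trained) so that it interpolates $x^2$ at equally spaced knots on $[-a, a]$. The function names, the choice of ReLU, and the interval $[-1, 1]$ are assumptions for this example, not something taken from the linked lesson.

```python
def relu(z):
    return max(0.0, z)


def build_square_net(a, n_hidden):
    """Hand-set weights of a one-hidden-layer ReLU net interpolating x^2 on [-a, a]."""
    h = 2.0 * a / n_hidden
    # Hidden unit k computes relu(x - knots[k]): input weight 1, bias -knots[k].
    knots = [-a + k * h for k in range(n_hidden)]
    # Slope of the chord of x^2 between knots[k] and knots[k] + h.
    slopes = [2.0 * t + h for t in knots]
    # Each output weight is the change in slope at that knot.
    out_w = [slopes[0]] + [slopes[k] - slopes[k - 1] for k in range(1, n_hidden)]
    out_b = a * a  # value of x^2 at the left end of the interval
    return knots, out_w, out_b


def predict(net, x):
    knots, out_w, out_b = net
    return out_b + sum(w * relu(x - t) for w, t in zip(out_w, knots))


net = build_square_net(a=1.0, n_hidden=20)
grid = [-1.0 + i / 500 for i in range(1001)]
max_err = max(abs(predict(net, x) - x * x) for x in grid)
print(max_err)
```

For piecewise-linear interpolation of $x^2$ with knot spacing $h$, the worst-case error on the interval is $h^2/4$, so doubling the number of hidden units cuts the error by a factor of four. A trained network would not recover exactly these weights, but this shows that a suitable set of weights exists.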

You can find an excellent lesson here with a notebook example.

Because of this ability, an ANN can also map complex relationships, for example between an image and its labels.

answered yesterday

Shubham Panchal

35117

Thank you very much, this is exactly what I was asking for!
– Boris Burkov
yesterday

Although true, it is a very bad idea to learn that. I fail to see where any generalization power would arise from. NNs shine when there is something to generalize, like CNNs for vision that capture patterns, or RNNs that can capture trends.
– Jeffrey
yesterday

I think the answer of @ShubhamPanchal is a little bit misleading. Yes, it is true that, by Cybenko's universal approximation theorem, we can approximate $f(x)=x^2$ with a feed-forward network with a single hidden layer containing a finite number of neurons, since such a network can approximate continuous functions on compact subsets of $\mathbb{R}^n$ under mild assumptions on the activation function.

But the main problem is that the theorem has a very important limitation: the function needs to be defined on a compact subset of $\mathbb{R}^n$ (compact subset = bounded + closed subset). Why is this problematic? When training the function approximator you will always have a finite data set, so you will approximate the function only inside a compact subset of $\mathbb{R}^n$. Outside that subset we can always find a point $x$ for which the approximation will probably fail. That being said, if you only want to approximate $f(x)=x^2$ on a compact subset of $\mathbb{R}$, then the answer to your question is yes. But if you want to approximate $f(x)=x^2$ for all $x \in \mathbb{R}$, then the answer is no (I exclude the trivial case in which you use a quadratic activation function).
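The limitation can be seen numerically with a small sketch. The function `relu_net` below is a hypothetical one-hidden-layer ReLU network with hand-set (not trained) weights that interpolates $x^2$ on $[-a, a]$; ReLU and the interval $[-1, 1]$ are assumptions chosen for illustration. Inside the interval the error is small, but to the right of $a$ the network simply continues along a straight line of slope $2a - h$, so the error grows without bound.

```python
def relu_net(x, a=1.0, n_hidden=20):
    """One-hidden-layer ReLU interpolant of x^2 on [-a, a] (hand-set weights)."""
    h = 2.0 * a / n_hidden
    y = a * a  # output bias: value of x^2 at x = -a
    for k in range(n_hidden):
        knot = -a + k * h
        # First unit carries the initial slope, the others add 2h each.
        weight = (2.0 * knot + h) if k == 0 else 2.0 * h
        y += weight * max(0.0, x - knot)
    return y


# Compare the network with the true square inside and outside [-1, 1].
for x in (0.5, 1.0, 2.0, 10.0, 100.0):
    approx = relu_net(x)
    print(f"x={x:7.1f}  net={approx:10.3f}  x^2={x * x:10.1f}  error={abs(approx - x * x):10.3f}")
```

Adding more hidden units or widening the interval moves the point where the approximation breaks down, but any finite ReLU network is piecewise linear and will eventually fall behind $x^2$.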

Side remark on Taylor approximation: always keep in mind that a Taylor approximation is only a local approximation. If you only want to approximate a function in a predefined region, then you should be able to use a Taylor series. But approximating $\sin(x)$ by the Taylor series expanded at $x=0$ will give you horrible results for $x \to 10000$ if you don't use enough terms in your Taylor expansion.
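A quick sketch of this effect, assuming a truncated Maclaurin series with a fixed number of terms (the helper name `taylor_sin`, the choice of 10 terms, and the test points are illustrative):

```python
import math


def taylor_sin(x, n_terms):
    """Sum the first n_terms nonzero terms of the Taylor series of sin at 0."""
    total, term = 0.0, x
    for k in range(n_terms):
        total += term
        # Next term: multiply by -x^2 / ((2k + 2)(2k + 3)).
        term *= -x * x / ((2 * k + 2) * (2 * k + 3))
    return total


# The same 10-term polynomial is compared with math.sin near and far from 0.
for x in (0.5, 3.0, 30.0):
    print(x, taylor_sin(x, 10), math.sin(x))
```

Near the expansion point the truncated series matches `math.sin` closely, while at $x = 30$ the same 10 terms are off by many orders of magnitude, because the omitted high-order terms are no longer small.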

edited yesterday

answered yesterday

MachineLearner

30410

New contributor

Nice catch! "compact set".
– Esmailian
yesterday

Many thanks, mate! Eye-opener!
– Boris Burkov
yesterday

@Esmailian: Thank you :).
– MachineLearner
yesterday
