How do you determine which representation of a function to use for Newton's method?




Take the equation:



$$x^2 + 2x = \frac{1}{1+x^2}$$



I subtracted the right term over to form $f_1(x)$:



$$x^2 + 2x - \frac{1}{1+x^2} = 0$$



I wanted to take the derivative, so I rearranged things to make it a bit easier; call it $f_2(x)$:



$$x^4 + 2x^3 + x^2 + 2x - 1 = 0$$



I noticed when I graphed $f_1(x)$ and $f_2(x)$ that their plots were different (although they shared the same solution for $x$).



Newton's method follows tangent lines down the graph, so I'd imagine that Newton's method for these two equations is not equivalent. They'd find the same solution, but they would get there in different ways. In that case, is there a way to decide which equation to use for Newton's method to obtain the best/quickest result?
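As an illustration (a minimal Python sketch, not part of the original question; the derivatives are hand-computed from the two forms above), running Newton's method on $f_1$ and $f_2$ from the same starting point shows the two different paths to the same root:

    # Newton's method on two algebraically equivalent forms of one equation.
    # Both runs converge to the positive root near 0.3708, along different iterates.

    def newton(f, df, x0, steps=8):
        x = x0
        for k in range(steps):
            x = x - f(x) / df(x)          # one Newton step
            print(f"  step {k + 1}: x = {x:.12f}")
        return x

    f1  = lambda x: x**2 + 2*x - 1 / (1 + x**2)
    df1 = lambda x: 2*x + 2 + 2*x / (1 + x**2)**2

    f2  = lambda x: x**4 + 2*x**3 + x**2 + 2*x - 1
    df2 = lambda x: 4*x**3 + 6*x**2 + 2*x + 2

    print("f1:")
    newton(f1, df1, 1.0)
    print("f2:")
    newton(f2, df2, 1.0)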










linear-algebra numerical-methods

asked Sep 16 at 3:14 by Sunden (last edited Sep 16 at 6:41)










  • To restate, an equation can be rewritten in various ways to express the solution as the root (zero) of a function. You ask about Newton's method, but note that it is a special case of finding a fixed point of a function by iteration. So already Newton's method can be said to involve rewriting an equation to expedite convergence to a root. – hardmath, Sep 16 at 14:44










  • Related to your question, see this post for a proof of a simple condition that guarantees the desired quadratic convergence of Newton's method to a root $r$ of $f$: $f'(r) \ne 0$, there is some constant $m$ such that $|f''(x)/f'(r)| \le m$ for every $x \in \operatorname{dom}(f)$, and your initial guess is at most $1/(3m)$ away from $r$. – user21820, Sep 17 at 12:41










  • math.stackexchange.com/questions/3078167/… is a rather good example of what I have been doing for 60+ years. – Claude Leibovici, Sep 17 at 15:13










  • To give you an idea of the impact: almost 25 years ago, I spent something like three months of work (full time) trying to make a function "more" linear. I saved $0.001$ seconds per run (CPU time). Big achievement, isn't it? The only problem is that a single simulation was solving that equation $\sim 10^7$ times, which means that we saved almost $3$ hours per simulation. – Claude Leibovici, Sep 18 at 9:42

















4 Answers


















37 votes – answered Sep 16 at 5:53 by Claude Leibovici

For your curiosity.



Asking this far-from-innocent question, you are almost asking me what I have been doing during the last sixty years.



My answer is: what is the transform of $f(x)$ which makes it the most linear around the solution? If you find it, you will save a lot of iterations.



One case I enjoy for demonstration is
$$f(x)=\sum_{i=1}^n a_i^x - b^x$$ where $1<a_1<a_2< \cdots < a_n$ and $b > a_n$; on this site, you could find many problems of this kind.



As written, this function is very bad since it varies very fast and is very nonlinear. Moreover, it goes through a maximum (not easy to identify) and you must start to the right of it to converge.



Now, consider the transform
$$g(x)=\log\left(\sum_{i=1}^n a_i^x \right) - x\log(b)$$ It is almost a straight line! This means that you can start from almost anywhere and converge fast.



For illustration purposes, try $n=6$, $a_i=p_i$ and $b=p_{n+1}$ ($p_i$ being the $i^{\text{th}}$ prime number). Being very lazy and using $x_0=0$, the iterates would be
$$\left(
\begin{array}{cc}
n & x_n \\
0 & 0 \\
1 & 1.607120621 \\
2 & 2.430003204 \\
3 & 2.545372693 \\
4 & 2.546847896 \\
5 & 2.546848123
\end{array}
\right)$$
Using $f(x)$, you must start iterating with $x_0 > 2.14$ to have convergence (big work to know it!). Let us try $x_0=2.2$; the successive iterates are



$$\left(
\begin{array}{cc}
n & x_n \\
0 & 2.200000000 \\
1 & 4.561765400 \\
2 & 4.241750505 \\
3 & 3.929819520 \\
4 & 3.629031502 \\
5 & 3.344096332 \\
6 & 3.082467015 \\
7 & 2.856023930 \\
8 & 2.682559774 \\
9 & 2.581375720 \\
10 & 2.549595979 \\
11 & 2.546866878 \\
12 & 2.546848124 \\
13 & 2.546848123
\end{array}
\right)$$



The last point I would like to mention: even if you have a good guess $x_0$ of the solution, search for a transform $g(x)$ such that $g(x_0)\,g''(x_0) > 0$ (this is Darboux's theorem) in order to avoid any overshoot of the solution on the path to it.
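A minimal Python sketch of this comparison, added for illustration (it is not part of the original answer; the names `f`, `g`, `newton` are mine), using the primes $2,3,5,7,11,13$ and $b=17$ as above:

    import math

    # a_i = first six primes, b = the seventh prime, as in the example above.
    a = [2, 3, 5, 7, 11, 13]
    b = 17

    def f(x):                      # raw form: very nonlinear
        return sum(ai**x for ai in a) - b**x

    def df(x):
        return sum(math.log(ai) * ai**x for ai in a) - math.log(b) * b**x

    def g(x):                      # log transform: almost a straight line
        return math.log(sum(ai**x for ai in a)) - x * math.log(b)

    def dg(x):
        s = sum(ai**x for ai in a)
        return sum(math.log(ai) * ai**x for ai in a) / s - math.log(b)

    def newton(h, dh, x, steps=15):
        for k in range(steps):
            x_new = x - h(x) / dh(x)
            print(f"  {k + 1:2d}  {x_new:.9f}")
            if abs(x_new - x) < 1e-12:
                break
            x = x_new

    print("g(x), x0 = 0:")         # reproduces the short table above
    newton(g, dg, 0.0)
    print("f(x), x0 = 2.2:")       # reproduces the long table above
    newton(f, df, 2.2)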




















  • Nice (+1)! In your example, I think the notation $p_i$ means the $i$th prime? So for $n=6$, you'd have $(a_1, a_2, \ldots, a_6) = (2, 3, \ldots, 13)$, and $b=17$? – mathmandan, Sep 16 at 16:52










  • @mathmandan. This is correct. Thanks & cheers – Claude Leibovici, Sep 16 at 17:18






  • I think you meant to write $$f(x)=\frac{\sum_{i=1}^n a_i^x}{b^x}$$ (with division rather than subtraction). Otherwise your "transform" to $g(x)$ gives a more-or-less unrelated function, no? – ruakh, Sep 17 at 18:49










  • @ruakh. Not at all. What you suggest is better than the original function but is almost hyperbolic, while $g(x)$ is almost linear. – Claude Leibovici, Sep 18 at 3:23










  • @ClaudeLeibovici: Oh, wait, I see. You never mention what equation you were solving, and I made a bad assumption: I thought you were solving the equation $f(x) = y_0$ for some (arbitrary) $y_0$, and I didn't see what was gained by "transforming" that to $g(x) = z_0$ unless there was some relationship between $g$ and $f$ (such as $g(x) = \log f(x)$). I now realize that you were solving the equation $f(x) = 0$, so your "transformation" involves converting that to $f_A(x) = f_B(x)$, then to $g_A(x) = g_B(x)$, then to $g(x) = 0$, where $g_A(x) = \log f_A(x)$ and $g_B(x) = \log f_B(x)$. – ruakh, Sep 18 at 4:20


















10 votes – answered Sep 16 at 4:06 by Milo Brandt

Newton's method is very robust once you are near a root - exactly what function you choose to run it on is not of that much consequence; it will converge quickly no matter what. You might as well just use whatever is easiest.



To be more precise, let $f$ be the function we wish to find a root of and define $g(x)=x-\frac{f(x)}{f'(x)}$ to be the function performing a single step of Newton's method. First: it should be clear that things go badly if $f'$ and $f$ share a root, so let's assume they don't - and we'll assume that $f$ is smooth near the root, for convenience.



The idea is that $g$ pulls everything near a root to the root - and to quantify how fast, we can take a look at its Taylor series at a root. Let $x_0$ be a root of $f$. If we differentiate $g$ and then simplify at $x_0$ by noting that $f(x_0)=0$, we see the following:
$$g(x_0)=x_0$$
$$g'(x_0)=0$$
$$g''(x_0)=\frac{f''(x_0)}{f'(x_0)}$$
In particular, this tells us that, once we are close enough to $x_0$, if the current amount of error is $e$, one more application of $g$ makes the error something more like $\frac{g''(x_0)}{2}e^2$ - this is pretty fantastic irrespective of what $g''(x_0)$ is: if you take a small number and square it over and over, it gets really small really fast, which is what we want to happen to the error. I suppose if you were really eager, you could try to make $g''(x_0)$ small by choosing $f$ carefully - but realistically, it does not matter, since running an additional step of the method completely dwarfs any gains from fine-tuning $f$.




This said, Newton's method can behave pretty badly on a global scale - even for polynomials it can do really nasty things*. Choosing your function carefully isn't likely to help you much here either, since when functions have really small derivatives, the method is unstable, but when they have rapidly growing derivatives (like near an asymptote), the method converges very very slowly - basically, this isn't a good method to look for a root without having any idea of where one might be, and fine tuning $f$ won't help you there either.



(*Example: Look up "Newton's method basins of attraction" on Google - you'll get these amazing plots of where the method converges if you run it on polynomials as simple as $x^3-1$ starting at various points in the complex plane. Basically, when it's not near a root, anything can happen)
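As a quick numerical check of the error-squaring behaviour described above, here is a small sketch (for illustration only, not part of the original answer) running Newton's method on $f(x)=x^2-2$ with high-precision arithmetic; the number of correct digits roughly doubles at every step:

    from decimal import Decimal, getcontext

    getcontext().prec = 60          # enough digits to watch the doubling

    # Newton step for f(x) = x^2 - 2:  x -> x - (x^2 - 2) / (2x)
    x = Decimal(1)
    root = Decimal(2).sqrt()
    for n in range(6):
        x = x - (x * x - 2) / (2 * x)
        print(f"step {n + 1}: error = {abs(x - root):.3E}")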
















  • There are also functions for which Newton's method diverges everywhere, even near the root. See e.g. $$y=\sqrt{\lvert x\rvert}$$ or $$y=\operatorname{sgn}(x)\cdot\sqrt{\lvert x\rvert}$$ for which Newton's method jumps between $a$ and $-a$ for any starting point $a\ne 0.$ – CiaPan, Sep 16 at 14:38











  • If $f$ has a root of multiplicity strictly greater than one, then Newton's method converges linearly. A method of order three or higher is worthwhile when very high accuracy is sought, i.e., thousands of significant figures. – Carl Christian, Sep 16 at 19:48


















5 votes

It depends on the derivative of your function at its zeros.



The formula suggests that $$x_{n+1} = x_n-\frac{f(x_n)}{f'(x_n)}$$



Thus you get a better (faster) result if the derivative of your function is larger at the zeros of $f(x)$.



The process becomes very chaotic if the derivative at a starting point is close to zero.



As a very interesting case you may consider $$y=\sin x$$ and try to find $x=\pi$ starting at a point close to $\pi/2$.



Without knowing the zeros of $f(x)$, making a decision is not an easy task.
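A small sketch of the $y=\sin x$ example just mentioned (an illustration, not part of the original answer): tiny changes in the starting point near $\pi/2$, where $\cos x \approx 0$, send the iteration to wildly different multiples of $\pi$.

    import math

    # Newton's method on f(x) = sin(x):  x -> x - sin(x)/cos(x) = x - tan(x)
    def newton_sin(x0, steps=8):
        x = x0
        for _ in range(steps):
            x = x - math.tan(x)
        return x

    # Starting points just below pi/2 = 1.5707963...
    for x0 in (1.55, 1.57, 1.5707, 1.5708):
        print(f"x0 = {x0}: converges to {newton_sin(x0) / math.pi:.4f} * pi")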




























2 votes

    It is not difficult to show that there are just two real solutions, since




    • $x^2+2x$ is decreasing from $+\infty$ to $-1$ on $(-\infty,-1]$, while $\frac{1}{x^2+1}$ is increasing from $0$ to $\frac{1}{2}$;

    • over $[-1,0]$ we have $x^2+2x\leq 0$ but $\frac{1}{x^2+1}>0$;

    • $x^2+2x$ is increasing from $0$ to $+\infty$ on $\mathbb{R}^+$, while $\frac{1}{x^2+1}$ is decreasing from $1$ to $0$.

    Let's say we are interested in finding the positive root, which lies in $I=\left[0,\frac{1}{2}\right]$.

    Over this interval $a(x)=x^2+2x$ is increasing and convex, while $b(x)=\frac{1}{x^2+1}$ is decreasing and concave.

    This suggests a tweak of the Newton-Raphson iteration:



    (Figure: the blue and orange curves $a(x)$ and $b(x)$ on $I$, together with the sienna and purple tangent lines whose intersection gives the starting point.)



    We are interested in the intersection between the blue and orange curves, which lies to the left of the intersection between the sienna and the purple tangent lines. By solving
    $$ 3x-\frac{1}{4} = \frac{28}{25}-\frac{16}{25}x $$
    we find that $\frac{137}{364}$ is already a good approximation of the solution in $[0,1]$. Newton's method (applied to $a(x)-b(x)$) with such a starting point is guaranteed to converge quadratically (and in a monotonic way) to the actual solution $0.3708102352\ldots$
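    (For illustration, a short Python sketch, not part of the original answer, checking that $x_0=\frac{137}{364}$ solves the tangent-line equation exactly and then running Newton's method on $a(x)-b(x)$ from it:)

        from fractions import Fraction

        # Intersection of the two tangent lines: 3x - 1/4 = 28/25 - (16/25)x
        x0 = Fraction(137, 364)
        assert 3 * x0 - Fraction(1, 4) == Fraction(28, 25) - Fraction(16, 25) * x0

        # Newton's method on h(x) = a(x) - b(x) = x^2 + 2x - 1/(x^2 + 1)
        h  = lambda x: x**2 + 2*x - 1 / (x**2 + 1)
        dh = lambda x: 2*x + 2 + 2*x / (x**2 + 1)**2

        x = float(x0)
        for k in range(5):
            x = x - h(x) / dh(x)
            print(f"step {k + 1}: x = {x:.10f}")   # -> 0.3708102352...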






















      Your Answer








      StackExchange.ready(function()
      var channelOptions =
      tags: "".split(" "),
      id: "69"
      ;
      initTagRenderer("".split(" "), "".split(" "), channelOptions);

      StackExchange.using("externalEditor", function()
      // Have to fire editor after snippets, if snippets enabled
      if (StackExchange.settings.snippets.snippetsEnabled)
      StackExchange.using("snippets", function()
      createEditor();
      );

      else
      createEditor();

      );

      function createEditor()
      StackExchange.prepareEditor(
      heartbeatType: 'answer',
      autoActivateHeartbeat: false,
      convertImagesToLinks: true,
      noModals: true,
      showLowRepImageUploadWarning: true,
      reputationToPostImages: 10,
      bindNavPrevention: true,
      postfix: "",
      imageUploader:
      brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
      contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/4.0/"u003ecc by-sa 4.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
      allowUrls: true
      ,
      noCode: true, onDemand: true,
      discardSelector: ".discard-answer"
      ,immediatelyShowMarkdownHelp:true
      );



      );














      draft saved

      draft discarded
















      StackExchange.ready(
      function ()
      StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fmath.stackexchange.com%2fquestions%2f3358137%2fhow-do-you-determine-which-representation-of-a-function-to-use-for-newtons-meth%23new-answer', 'question_page');

      );

      Post as a guest















      Required, but never shown


























      4 Answers
      4






      active

      oldest

      votes








      4 Answers
      4






      active

      oldest

      votes









      active

      oldest

      votes






      active

      oldest

      votes









      37

















      $begingroup$

      For your curiosity.



      Asking this far-from-innocent question, you are almost asking me what I have been doing during the last sixty years.



      My answer is : what is the transform of $f(x)$ which makes it the most linear around the solution ? If you find it, you will save a lot of iterations.



      One case I enjoy for demonstration is
      $$f(x)=sum_i=1^n a_i^x- b^x$$ where $1<a_1<a_2< cdots < a_n$ and $b > a_n$; on this site, you could find many problems of this kind.



      As written, this function is very bad since it varies very fast and it is very nonlinear. Moreover, it goes through a maximum (not easy to identify) and you must start on the right of it to converge.



      Now, consider the transform
      $$g(x)=logleft(sum_i=1^n a_i^x right)- xlog(b)$$ It is almost a straight line ! This means that you can start from almost anywhere and converge fast.



      For illustration purposes, trying with $n=6$, $a_i=p_i$ and $b=p_n+1$ ($p_i$ being the $i^th$ prime number). Being very lazy and using $x_0=0$, the iterates would be
      $$left(
      beginarraycc
      n & x_n \
      0 & 0 \
      1 & 1.607120621 \
      2 & 2.430003204 \
      3 & 2.545372693 \
      4 & 2.546847896 \
      5 & 2.546848123
      endarray
      right)$$
      Using $f(x)$, you must start iterating with $x_0 > 2.14$ to have convergence (big work to know it !). Let us try with $x_0=2.2$ to get as successive iterates



      $$left(
      beginarraycc
      n & x_n \
      0 & 2.200000000 \
      1 & 4.561765400 \
      2 & 4.241750505 \
      3 & 3.929819520 \
      4 & 3.629031502 \
      5 & 3.344096332 \
      6 & 3.082467015 \
      7 & 2.856023930 \
      8 & 2.682559774 \
      9 & 2.581375720 \
      10 & 2.549595979 \
      11 & 2.546866878 \
      12 & 2.546848124 \
      13 & 2.546848123
      endarray
      right)$$



      The last point I would like to mention : even if you have a good guess $x_0$ of the solution, search for a transform $g(x)$ such that $g(x_0),g''(x_0) >0$ (this is Darboux theorem) in order to avoid any overshoot of the solution during the path to it.






      share|cite|improve this answer












      $endgroup$














      • $begingroup$
        Nice (+1)! In your example, I think the notation $p_i$ means the $i$th prime? So for $n=6$, you'd have $(a_1, a_2, ldots, a_6) = (2, 3, ldots, 13)$, and $b=17$?
        $endgroup$
        – mathmandan
        Sep 16 at 16:52










      • $begingroup$
        @mathmandan. This is correct. Thanks & cheers
        $endgroup$
        – Claude Leibovici
        Sep 16 at 17:18






      • 1




        $begingroup$
        I think you meant to write $$f(x)=fracsum_i=1^n a_i^xb^x$$ (with division rather than subtraction). Otherwise your "transform" to $g(x)$ gives a more-or-less unrelated function; no?)
        $endgroup$
        – ruakh
        Sep 17 at 18:49










      • $begingroup$
        @ruakh. Not at all. What you suggest is better that the original function but is almost hyperbolic while $g(x)$ is almost linear.
        $endgroup$
        – Claude Leibovici
        Sep 18 at 3:23










      • $begingroup$
        @ClaudeLeibovici: Oh, wait, I see. You never mention what equation you were solving, and I made a bad assumption: I thought you were solving the equation $f(x) = y_0$ for some (arbitrary) $y_0$, and I didn't see what was gained by "transforming" that to $g(x) = z_0$ unless there was some relationship between $g$ and $f$ (such as $g(x) = log f(x)$). I now realize that you were solving the equation $f(x) = 0$, so your "transformation" involves converting that to $f_A(x) = f_B(x)$, then to $g_A(x) = g_B(x)$, then to $g(x) = 0$, where $g_A(x) = log f_A(x)$ and $g_B(x) = log f_B(x)$.
        $endgroup$
        – ruakh
        Sep 18 at 4:20















      37

















      $begingroup$

      For your curiosity.



      Asking this far-from-innocent question, you are almost asking me what I have been doing during the last sixty years.



      My answer is : what is the transform of $f(x)$ which makes it the most linear around the solution ? If you find it, you will save a lot of iterations.



      One case I enjoy for demonstration is
      $$f(x)=sum_i=1^n a_i^x- b^x$$ where $1<a_1<a_2< cdots < a_n$ and $b > a_n$; on this site, you could find many problems of this kind.



      As written, this function is very bad since it varies very fast and it is very nonlinear. Moreover, it goes through a maximum (not easy to identify) and you must start on the right of it to converge.



      Now, consider the transform
      $$g(x)=logleft(sum_i=1^n a_i^x right)- xlog(b)$$ It is almost a straight line ! This means that you can start from almost anywhere and converge fast.



      For illustration purposes, trying with $n=6$, $a_i=p_i$ and $b=p_n+1$ ($p_i$ being the $i^th$ prime number). Being very lazy and using $x_0=0$, the iterates would be
      $$left(
      beginarraycc
      n & x_n \
      0 & 0 \
      1 & 1.607120621 \
      2 & 2.430003204 \
      3 & 2.545372693 \
      4 & 2.546847896 \
      5 & 2.546848123
      endarray
      right)$$
      Using $f(x)$, you must start iterating with $x_0 > 2.14$ to have convergence (big work to know it !). Let us try with $x_0=2.2$ to get as successive iterates



      $$left(
      beginarraycc
      n & x_n \
      0 & 2.200000000 \
      1 & 4.561765400 \
      2 & 4.241750505 \
      3 & 3.929819520 \
      4 & 3.629031502 \
      5 & 3.344096332 \
      6 & 3.082467015 \
      7 & 2.856023930 \
      8 & 2.682559774 \
      9 & 2.581375720 \
      10 & 2.549595979 \
      11 & 2.546866878 \
      12 & 2.546848124 \
      13 & 2.546848123
      endarray
      right)$$



      The last point I would like to mention : even if you have a good guess $x_0$ of the solution, search for a transform $g(x)$ such that $g(x_0),g''(x_0) >0$ (this is Darboux theorem) in order to avoid any overshoot of the solution during the path to it.






      share|cite|improve this answer












      $endgroup$














      • $begingroup$
        Nice (+1)! In your example, I think the notation $p_i$ means the $i$th prime? So for $n=6$, you'd have $(a_1, a_2, ldots, a_6) = (2, 3, ldots, 13)$, and $b=17$?
        $endgroup$
        – mathmandan
        Sep 16 at 16:52










      • $begingroup$
        @mathmandan. This is correct. Thanks & cheers
        $endgroup$
        – Claude Leibovici
        Sep 16 at 17:18






      • 1




        $begingroup$
        I think you meant to write $$f(x)=fracsum_i=1^n a_i^xb^x$$ (with division rather than subtraction). Otherwise your "transform" to $g(x)$ gives a more-or-less unrelated function; no?)
        $endgroup$
        – ruakh
        Sep 17 at 18:49










      • $begingroup$
        @ruakh. Not at all. What you suggest is better that the original function but is almost hyperbolic while $g(x)$ is almost linear.
        $endgroup$
        – Claude Leibovici
        Sep 18 at 3:23










      • $begingroup$
        @ClaudeLeibovici: Oh, wait, I see. You never mention what equation you were solving, and I made a bad assumption: I thought you were solving the equation $f(x) = y_0$ for some (arbitrary) $y_0$, and I didn't see what was gained by "transforming" that to $g(x) = z_0$ unless there was some relationship between $g$ and $f$ (such as $g(x) = log f(x)$). I now realize that you were solving the equation $f(x) = 0$, so your "transformation" involves converting that to $f_A(x) = f_B(x)$, then to $g_A(x) = g_B(x)$, then to $g(x) = 0$, where $g_A(x) = log f_A(x)$ and $g_B(x) = log f_B(x)$.
        $endgroup$
        – ruakh
        Sep 18 at 4:20













      37















      37











      37







      $begingroup$

      For your curiosity.



      Asking this far-from-innocent question, you are almost asking me what I have been doing during the last sixty years.



      My answer is : what is the transform of $f(x)$ which makes it the most linear around the solution ? If you find it, you will save a lot of iterations.



      One case I enjoy for demonstration is
      $$f(x)=sum_i=1^n a_i^x- b^x$$ where $1<a_1<a_2< cdots < a_n$ and $b > a_n$; on this site, you could find many problems of this kind.



      As written, this function is very bad since it varies very fast and it is very nonlinear. Moreover, it goes through a maximum (not easy to identify) and you must start on the right of it to converge.



      Now, consider the transform
      $$g(x)=logleft(sum_i=1^n a_i^x right)- xlog(b)$$ It is almost a straight line ! This means that you can start from almost anywhere and converge fast.



      For illustration purposes, trying with $n=6$, $a_i=p_i$ and $b=p_n+1$ ($p_i$ being the $i^th$ prime number). Being very lazy and using $x_0=0$, the iterates would be
      $$left(
      beginarraycc
      n & x_n \
      0 & 0 \
      1 & 1.607120621 \
      2 & 2.430003204 \
      3 & 2.545372693 \
      4 & 2.546847896 \
      5 & 2.546848123
      endarray
      right)$$
      Using $f(x)$, you must start iterating with $x_0 > 2.14$ to have convergence (big work to know it !). Let us try with $x_0=2.2$ to get as successive iterates



      $$left(
      beginarraycc
      n & x_n \
      0 & 2.200000000 \
      1 & 4.561765400 \
      2 & 4.241750505 \
      3 & 3.929819520 \
      4 & 3.629031502 \
      5 & 3.344096332 \
      6 & 3.082467015 \
      7 & 2.856023930 \
      8 & 2.682559774 \
      9 & 2.581375720 \
      10 & 2.549595979 \
      11 & 2.546866878 \
      12 & 2.546848124 \
      13 & 2.546848123
      endarray
      right)$$



      The last point I would like to mention : even if you have a good guess $x_0$ of the solution, search for a transform $g(x)$ such that $g(x_0),g''(x_0) >0$ (this is Darboux theorem) in order to avoid any overshoot of the solution during the path to it.






      share|cite|improve this answer












      $endgroup$



      For your curiosity.



      Asking this far-from-innocent question, you are almost asking me what I have been doing during the last sixty years.



      My answer is : what is the transform of $f(x)$ which makes it the most linear around the solution ? If you find it, you will save a lot of iterations.



      One case I enjoy for demonstration is
      $$f(x)=sum_i=1^n a_i^x- b^x$$ where $1<a_1<a_2< cdots < a_n$ and $b > a_n$; on this site, you could find many problems of this kind.



      As written, this function is very bad since it varies very fast and it is very nonlinear. Moreover, it goes through a maximum (not easy to identify) and you must start on the right of it to converge.



      Now, consider the transform
      $$g(x)=logleft(sum_i=1^n a_i^x right)- xlog(b)$$ It is almost a straight line ! This means that you can start from almost anywhere and converge fast.



      For illustration purposes, trying with $n=6$, $a_i=p_i$ and $b=p_n+1$ ($p_i$ being the $i^th$ prime number). Being very lazy and using $x_0=0$, the iterates would be
      $$left(
      beginarraycc
      n & x_n \
      0 & 0 \
      1 & 1.607120621 \
      2 & 2.430003204 \
      3 & 2.545372693 \
      4 & 2.546847896 \
      5 & 2.546848123
      endarray
      right)$$
      Using $f(x)$, you must start iterating with $x_0 > 2.14$ to have convergence (big work to know it !). Let us try with $x_0=2.2$ to get as successive iterates



      $$left(
      beginarraycc
      n & x_n \
      0 & 2.200000000 \
      1 & 4.561765400 \
      2 & 4.241750505 \
      3 & 3.929819520 \
      4 & 3.629031502 \
      5 & 3.344096332 \
      6 & 3.082467015 \
      7 & 2.856023930 \
      8 & 2.682559774 \
      9 & 2.581375720 \
      10 & 2.549595979 \
      11 & 2.546866878 \
      12 & 2.546848124 \
      13 & 2.546848123
      endarray
      right)$$



      The last point I would like to mention : even if you have a good guess $x_0$ of the solution, search for a transform $g(x)$ such that $g(x_0),g''(x_0) >0$ (this is Darboux theorem) in order to avoid any overshoot of the solution during the path to it.







      share|cite|improve this answer















      share|cite|improve this answer




      share|cite|improve this answer








      edited Sep 17 at 12:43









      user21820

      43.4k5 gold badges50 silver badges180 bronze badges




      43.4k5 gold badges50 silver badges180 bronze badges










      answered Sep 16 at 5:53









      Claude LeiboviciClaude Leibovici

      141k11 gold badges67 silver badges149 bronze badges




      141k11 gold badges67 silver badges149 bronze badges














      • $begingroup$
        Nice (+1)! In your example, I think the notation $p_i$ means the $i$th prime? So for $n=6$, you'd have $(a_1, a_2, ldots, a_6) = (2, 3, ldots, 13)$, and $b=17$?
        $endgroup$
        – mathmandan
        Sep 16 at 16:52










      • $begingroup$
        @mathmandan. This is correct. Thanks & cheers
        $endgroup$
        – Claude Leibovici
        Sep 16 at 17:18






      • 1




        $begingroup$
        I think you meant to write $$f(x)=fracsum_i=1^n a_i^xb^x$$ (with division rather than subtraction). Otherwise your "transform" to $g(x)$ gives a more-or-less unrelated function; no?)
        $endgroup$
        – ruakh
        Sep 17 at 18:49










      • $begingroup$
        @ruakh. Not at all. What you suggest is better that the original function but is almost hyperbolic while $g(x)$ is almost linear.
        $endgroup$
        – Claude Leibovici
        Sep 18 at 3:23










      • $begingroup$
        @ClaudeLeibovici: Oh, wait, I see. You never mention what equation you were solving, and I made a bad assumption: I thought you were solving the equation $f(x) = y_0$ for some (arbitrary) $y_0$, and I didn't see what was gained by "transforming" that to $g(x) = z_0$ unless there was some relationship between $g$ and $f$ (such as $g(x) = log f(x)$). I now realize that you were solving the equation $f(x) = 0$, so your "transformation" involves converting that to $f_A(x) = f_B(x)$, then to $g_A(x) = g_B(x)$, then to $g(x) = 0$, where $g_A(x) = log f_A(x)$ and $g_B(x) = log f_B(x)$.
        $endgroup$
        – ruakh
        Sep 18 at 4:20
















      • $begingroup$
        Nice (+1)! In your example, I think the notation $p_i$ means the $i$th prime? So for $n=6$, you'd have $(a_1, a_2, ldots, a_6) = (2, 3, ldots, 13)$, and $b=17$?
        $endgroup$
        – mathmandan
        Sep 16 at 16:52










      • $begingroup$
        @mathmandan. This is correct. Thanks & cheers
        $endgroup$
        – Claude Leibovici
        Sep 16 at 17:18






      • 1




        $begingroup$
        I think you meant to write $$f(x)=fracsum_i=1^n a_i^xb^x$$ (with division rather than subtraction). Otherwise your "transform" to $g(x)$ gives a more-or-less unrelated function; no?)
        $endgroup$
        – ruakh
        Sep 17 at 18:49










      • $begingroup$
        @ruakh. Not at all. What you suggest is better that the original function but is almost hyperbolic while $g(x)$ is almost linear.
        $endgroup$
        – Claude Leibovici
        Sep 18 at 3:23










      • $begingroup$
        @ClaudeLeibovici: Oh, wait, I see. You never mention what equation you were solving, and I made a bad assumption: I thought you were solving the equation $f(x) = y_0$ for some (arbitrary) $y_0$, and I didn't see what was gained by "transforming" that to $g(x) = z_0$ unless there was some relationship between $g$ and $f$ (such as $g(x) = log f(x)$). I now realize that you were solving the equation $f(x) = 0$, so your "transformation" involves converting that to $f_A(x) = f_B(x)$, then to $g_A(x) = g_B(x)$, then to $g(x) = 0$, where $g_A(x) = log f_A(x)$ and $g_B(x) = log f_B(x)$.
        $endgroup$
        – ruakh
        Sep 18 at 4:20















      $begingroup$
      Nice (+1)! In your example, I think the notation $p_i$ means the $i$th prime? So for $n=6$, you'd have $(a_1, a_2, ldots, a_6) = (2, 3, ldots, 13)$, and $b=17$?
      $endgroup$
      – mathmandan
      Sep 16 at 16:52




      $begingroup$
      Nice (+1)! In your example, I think the notation $p_i$ means the $i$th prime? So for $n=6$, you'd have $(a_1, a_2, ldots, a_6) = (2, 3, ldots, 13)$, and $b=17$?
      $endgroup$
      – mathmandan
      Sep 16 at 16:52












      $begingroup$
      @mathmandan. This is correct. Thanks & cheers
      $endgroup$
      – Claude Leibovici
      Sep 16 at 17:18




      $begingroup$
      @mathmandan. This is correct. Thanks & cheers
      $endgroup$
      – Claude Leibovici
      Sep 16 at 17:18




      1




      1




      $begingroup$
      I think you meant to write $$f(x)=fracsum_i=1^n a_i^xb^x$$ (with division rather than subtraction). Otherwise your "transform" to $g(x)$ gives a more-or-less unrelated function; no?)
      $endgroup$
      – ruakh
      Sep 17 at 18:49




      $begingroup$
      I think you meant to write $$f(x)=fracsum_i=1^n a_i^xb^x$$ (with division rather than subtraction). Otherwise your "transform" to $g(x)$ gives a more-or-less unrelated function; no?)
      $endgroup$
      – ruakh
      Sep 17 at 18:49












      $begingroup$
      @ruakh. Not at all. What you suggest is better that the original function but is almost hyperbolic while $g(x)$ is almost linear.
      $endgroup$
      – Claude Leibovici
      Sep 18 at 3:23




      $begingroup$
      @ruakh. Not at all. What you suggest is better that the original function but is almost hyperbolic while $g(x)$ is almost linear.
      $endgroup$
      – Claude Leibovici
      Sep 18 at 3:23












      $begingroup$
      @ClaudeLeibovici: Oh, wait, I see. You never mention what equation you were solving, and I made a bad assumption: I thought you were solving the equation $f(x) = y_0$ for some (arbitrary) $y_0$, and I didn't see what was gained by "transforming" that to $g(x) = z_0$ unless there was some relationship between $g$ and $f$ (such as $g(x) = log f(x)$). I now realize that you were solving the equation $f(x) = 0$, so your "transformation" involves converting that to $f_A(x) = f_B(x)$, then to $g_A(x) = g_B(x)$, then to $g(x) = 0$, where $g_A(x) = log f_A(x)$ and $g_B(x) = log f_B(x)$.
      $endgroup$
      – ruakh
      Sep 18 at 4:20




      $begingroup$
      @ClaudeLeibovici: Oh, wait, I see. You never mention what equation you were solving, and I made a bad assumption: I thought you were solving the equation $f(x) = y_0$ for some (arbitrary) $y_0$, and I didn't see what was gained by "transforming" that to $g(x) = z_0$ unless there was some relationship between $g$ and $f$ (such as $g(x) = log f(x)$). I now realize that you were solving the equation $f(x) = 0$, so your "transformation" involves converting that to $f_A(x) = f_B(x)$, then to $g_A(x) = g_B(x)$, then to $g(x) = 0$, where $g_A(x) = log f_A(x)$ and $g_B(x) = log f_B(x)$.
      $endgroup$
      – ruakh
      Sep 18 at 4:20













      10

















      $begingroup$

      Newton's method is very robust once you are near a root - exactly what function you chose to run it on is not of that much consequence; it will converge quickly no matter what. You might as well just use whatever is easiest.



      To be more precise, let $f$ be the function we wish to find a root of and define $g(x)=x-fracf(x)f'(x)$ to be the function performing a single step of Newton's method. First: it should be clear that things go badly if $f'$ and $f$ share a root, so let's assume they don't - and we'll assume that $f$ is smooth near the root, for convenience.



      The idea is that $g$ pulls everything near a root to the root - and to quantify how fast, we can take a look at its Taylor series at a root. Let $x_0$ be a root of $f$. If we differentiate $g$ then simplify at $x_0$ by noting that $f(x_0)=0$, we can see the following:
      $$g(x_0)=x_0$$
      $$g'(x_0)=0$$
      $$g''(x_0)=fracf''(x_0)f'(x_0)$$
      In particular, this tells us that, once we are close enough to $x_0$, if the current amount of error is $e$, one more application of $g$ makes the error something more like $g''(x_0)e^2$ - this is pretty fantastic irrespective of what $g''(x_0)$ is - if you take a small number and square it over and over and over, it gets really small really fast - which is what we want to happen to the error. I suppose if you were really eager, you could try to make $g''(x_0)$ small by choosing $f$ carefully - but realistically, it does not matter, since it tends to be the case that running an additional step of the method completely dwarfs any gains from fine-tuning $f$.




      This said, Newton's method can behave pretty badly on a global scale - even for polynomials it can do really nasty things*. Choosing your function carefully isn't likely to help you much here either, since when functions have really small derivatives, the method is unstable, but when they have rapidly growing derivatives (like near an asymptote), the method converges very very slowly - basically, this isn't a good method to look for a root without having any idea of where one might be, and fine tuning $f$ won't help you there either.



      (*Example: Look up "Newton's method basins of attraction" on Google - you'll get these amazing plots of where the method converges if you run it on polynomials as simple as $x^3-1$ starting at various points in the complex plane. Basically, when it's not near a root, anything can happen)






      share|cite|improve this answer










      $endgroup$










      • 4




        $begingroup$
        There are also functions, for which Newton's method diverges everywhere, also near the root. See e.g. $$y=sqrt lvert xrvert$$ or $$y=operatornamesgn(x)cdotsqrt lvert xrvert$$ for which Newton's method jumps between $a$ and $-a$ for any starting point $ane 0.$
        $endgroup$
        – CiaPan
        Sep 16 at 14:38











      • $begingroup$
        If $f$ has a root of multiplicity strictly greater than one, then Newton's method converges linearly. A method of order three or higher is worthwhile when very high accuracy is sort, i.e., thousands of significant figures.
        $endgroup$
        – Carl Christian
        Sep 16 at 19:48















      10

















      $begingroup$

      Newton's method is very robust once you are near a root - exactly what function you chose to run it on is not of that much consequence; it will converge quickly no matter what. You might as well just use whatever is easiest.



      To be more precise, let $f$ be the function we wish to find a root of and define $g(x)=x-fracf(x)f'(x)$ to be the function performing a single step of Newton's method. First: it should be clear that things go badly if $f'$ and $f$ share a root, so let's assume they don't - and we'll assume that $f$ is smooth near the root, for convenience.



      The idea is that $g$ pulls everything near a root to the root - and to quantify how fast, we can take a look at its Taylor series at a root. Let $x_0$ be a root of $f$. If we differentiate $g$ then simplify at $x_0$ by noting that $f(x_0)=0$, we can see the following:
      $$g(x_0)=x_0$$
      $$g'(x_0)=0$$
      $$g''(x_0)=fracf''(x_0)f'(x_0)$$
      In particular, this tells us that, once we are close enough to $x_0$, if the current amount of error is $e$, one more application of $g$ makes the error something more like $g''(x_0)e^2$ - this is pretty fantastic irrespective of what $g''(x_0)$ is - if you take a small number and square it over and over and over, it gets really small really fast - which is what we want to happen to the error. I suppose if you were really eager, you could try to make $g''(x_0)$ small by choosing $f$ carefully - but realistically, it does not matter, since it tends to be the case that running an additional step of the method completely dwarfs any gains from fine-tuning $f$.




      This said, Newton's method can behave pretty badly on a global scale - even for polynomials it can do really nasty things*. Choosing your function carefully isn't likely to help you much here either, since when functions have really small derivatives, the method is unstable, but when they have rapidly growing derivatives (like near an asymptote), the method converges very very slowly - basically, this isn't a good method to look for a root without having any idea of where one might be, and fine tuning $f$ won't help you there either.



      (*Example: Look up "Newton's method basins of attraction" on Google - you'll get these amazing plots of where the method converges if you run it on polynomials as simple as $x^3-1$ starting at various points in the complex plane. Basically, when it's not near a root, anything can happen)






      share|cite|improve this answer










      $endgroup$










      • 4




        $begingroup$
        There are also functions, for which Newton's method diverges everywhere, also near the root. See e.g. $$y=sqrt lvert xrvert$$ or $$y=operatornamesgn(x)cdotsqrt lvert xrvert$$ for which Newton's method jumps between $a$ and $-a$ for any starting point $ane 0.$
        $endgroup$
        – CiaPan
        Sep 16 at 14:38











      • $begingroup$
        If $f$ has a root of multiplicity strictly greater than one, then Newton's method converges linearly. A method of order three or higher is worthwhile when very high accuracy is sort, i.e., thousands of significant figures.
        $endgroup$
        – Carl Christian
        Sep 16 at 19:48













      10















      10











      10







      $begingroup$

      Newton's method is very robust once you are near a root - exactly what function you chose to run it on is not of that much consequence; it will converge quickly no matter what. You might as well just use whatever is easiest.



      To be more precise, let $f$ be the function we wish to find a root of and define $g(x)=x-fracf(x)f'(x)$ to be the function performing a single step of Newton's method. First: it should be clear that things go badly if $f'$ and $f$ share a root, so let's assume they don't - and we'll assume that $f$ is smooth near the root, for convenience.



      The idea is that $g$ pulls everything near a root to the root - and to quantify how fast, we can take a look at its Taylor series at a root. Let $x_0$ be a root of $f$. If we differentiate $g$ then simplify at $x_0$ by noting that $f(x_0)=0$, we can see the following:
      $$g(x_0)=x_0$$
      $$g'(x_0)=0$$
      $$g''(x_0)=fracf''(x_0)f'(x_0)$$
      In particular, this tells us that, once we are close enough to $x_0$, if the current amount of error is $e$, one more application of $g$ makes the error something more like $g''(x_0)e^2$ - this is pretty fantastic irrespective of what $g''(x_0)$ is - if you take a small number and square it over and over and over, it gets really small really fast - which is what we want to happen to the error. I suppose if you were really eager, you could try to make $g''(x_0)$ small by choosing $f$ carefully - but realistically, it does not matter, since it tends to be the case that running an additional step of the method completely dwarfs any gains from fine-tuning $f$.




      This said, Newton's method can behave pretty badly on a global scale - even for polynomials it can do really nasty things*. Choosing your function carefully isn't likely to help you much here either, since when functions have really small derivatives, the method is unstable, but when they have rapidly growing derivatives (like near an asymptote), the method converges very very slowly - basically, this isn't a good method to look for a root without having any idea of where one might be, and fine tuning $f$ won't help you there either.



      (*Example: Look up "Newton's method basins of attraction" on Google - you'll get these amazing plots of where the method converges if you run it on polynomials as simple as $x^3-1$ starting at various points in the complex plane. Basically, when it's not near a root, anything can happen)






      share|cite|improve this answer










      $endgroup$



      Newton's method is very robust once you are near a root - exactly what function you chose to run it on is not of that much consequence; it will converge quickly no matter what. You might as well just use whatever is easiest.



      To be more precise, let $f$ be the function we wish to find a root of and define $g(x)=x-fracf(x)f'(x)$ to be the function performing a single step of Newton's method. First: it should be clear that things go badly if $f'$ and $f$ share a root, so let's assume they don't - and we'll assume that $f$ is smooth near the root, for convenience.



      The idea is that $g$ pulls everything near a root to the root - and to quantify how fast, we can take a look at its Taylor series at a root. Let $x_0$ be a root of $f$. If we differentiate $g$ then simplify at $x_0$ by noting that $f(x_0)=0$, we can see the following:
      $$g(x_0)=x_0$$
      $$g'(x_0)=0$$
      $$g''(x_0)=fracf''(x_0)f'(x_0)$$
      In particular, this tells us that, once we are close enough to $x_0$, if the current amount of error is $e$, one more application of $g$ makes the error something more like $g''(x_0)e^2$ - this is pretty fantastic irrespective of what $g''(x_0)$ is - if you take a small number and square it over and over and over, it gets really small really fast - which is what we want to happen to the error. I suppose if you were really eager, you could try to make $g''(x_0)$ small by choosing $f$ carefully - but realistically, it does not matter, since it tends to be the case that running an additional step of the method completely dwarfs any gains from fine-tuning $f$.




      This said, Newton's method can behave pretty badly on a global scale - even for polynomials it can do really nasty things*. Choosing your function carefully isn't likely to help you much here either, since when functions have really small derivatives, the method is unstable, but when they have rapidly growing derivatives (like near an asymptote), the method converges very very slowly - basically, this isn't a good method to look for a root without having any idea of where one might be, and fine tuning $f$ won't help you there either.



      (*Example: Look up "Newton's method basins of attraction" on Google - you'll get these amazing plots of where the method converges if you run it on polynomials as simple as $x^3-1$ starting at various points in the complex plane. Basically, when it's not near a root, anything can happen)







      share|cite|improve this answer

      answered Sep 16 at 4:06

      Milo Brandt
      45.6k · 5 gold badges · 83 silver badges · 149 bronze badges
      • 4

        $begingroup$
        There are also functions for which Newton's method fails to converge from every starting point, even arbitrarily close to the root. See e.g. $$y=\sqrt{\lvert x\rvert}$$ or $$y=\operatorname{sgn}(x)\cdot\sqrt{\lvert x\rvert},$$ for which Newton's method jumps between $a$ and $-a$ for any starting point $a\ne 0$.
        $endgroup$
        – CiaPan
        Sep 16 at 14:38

      • $begingroup$
        If $f$ has a root of multiplicity strictly greater than one, then Newton's method converges only linearly. A method of order three or higher is worthwhile when very high accuracy is sought, i.e., thousands of significant figures.
        $endgroup$
        – Carl Christian
        Sep 16 at 19:48
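      A quick sketch of CiaPan's second example: for $f(x)=\operatorname{sgn}(x)\sqrt{\lvert x\rvert}$ the Newton step simplifies to $x - f(x)/f'(x) = x - 2x = -x$, so the iterates just flip sign forever (the starting value 0.7 below is an arbitrary choice):

          import math

          def f(x):
              return math.copysign(math.sqrt(abs(x)), x)   # sgn(x)*sqrt(|x|)

          def df(x):
              return 1 / (2 * math.sqrt(abs(x)))           # its derivative, x != 0

          x = 0.7
          for _ in range(6):
              x = x - f(x) / df(x)
              print(x)   # alternates -0.7, 0.7, -0.7, ... never converging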












      5

















      $begingroup$

      It depends on the derivative of your function at the zeros of your function.

      The iteration formula is $$x_{n+1} = x_n-\frac{f(x_n)}{f'(x_n)}$$

      Thus you get a better (faster) result if the derivative of your function is larger in magnitude at the zeros of $f(x)$.

      The process becomes very chaotic if the derivative at a starting point is close to zero.

      As a very interesting case you may consider $$y=\sin x$$ and try to find $x=\pi$ starting at a point close to $\pi/2$.

      Without knowing the zeros of $f(x)$, making a decision is not an easy task.
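      To illustrate (a small sketch; the starting points are my own choices), watch how violently a single Newton step for $\sin x$ behaves as the start approaches $\pi/2$, where the derivative $\cos x$ vanishes:

          import math

          def newton_step(x):
              return x - math.sin(x) / math.cos(x)   # one Newton step for sin(x)

          # The closer the start is to pi/2 (where cos x = 0), the farther
          # the tangent line throws the next iterate.
          for x0 in (1.4, 1.55, 1.57, 1.5707):
              print(x0, "->", newton_step(x0))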






      share|cite|improve this answer












      $endgroup$



















          edited Sep 17 at 16:11

























          answered Sep 16 at 3:44









          Mohammad Riazi-Kermani

          60.3k · 4 gold badges · 29 silver badges · 76 bronze badges




























              2

















              $begingroup$

              It is not difficult to show that there are just two real solutions, since

              • $x^2+2x$ is decreasing from $+\infty$ to $-1$ on $(-\infty,-1]$, while $\frac{1}{x^2+1}$ is increasing from $0$ to $\frac{1}{2}$;

              • over $[-1,0]$ we have $x^2+2x\leq 0$ but $\frac{1}{x^2+1}>0$;

              • $x^2+2x$ is increasing from $0$ to $+\infty$ on $\mathbb{R}^+$, while $\frac{1}{x^2+1}$ is decreasing from $1$ to $0$.

              Let's say we are interested in finding the positive root, which lies in $I=\left[0,\frac{1}{2}\right]$.

              Over such an interval $a(x)=x^2+2x$ is increasing and convex, while $b(x)=\frac{1}{x^2+1}$ is decreasing and concave.

              This suggests a tweak of the Newton-Raphson iteration:

              [image: plot of $a(x)$ and $b(x)$ over $I$, together with their tangent lines at the right endpoint]

              We are interested in the intersection between the two curves, which lies to the left of the intersection of the two tangent lines drawn at $x=\frac{1}{2}$. By solving
              $$ 3x-\frac{1}{4} = \frac{28}{25}-\frac{16}{25}x $$
              we find that $\frac{137}{364}$ is already a good approximation of the solution in $[0,1]$. Newton's method (applied to $a(x)-b(x)$) with such a starting point is guaranteed to converge quadratically (and monotonically) to the actual solution $0.3708102352\ldots$
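              As a sanity check, here is a minimal sketch (the function names f and df are mine) running Newton's method on $f(x)=a(x)-b(x)$ from the starting point $137/364$; the iterates decrease monotonically onto $0.3708102352\ldots$:

                  def f(x):   # a(x) - b(x) = x^2 + 2x - 1/(x^2 + 1)
                      return x**2 + 2*x - 1/(x**2 + 1)

                  def df(x):  # its derivative
                      return 2*x + 2 + 2*x/(x**2 + 1)**2

                  x = 137/364          # start from the tangent-line construction
                  for _ in range(5):
                      x = x - f(x)/df(x)
                      print(f"{x:.16f}")   # settles on 0.3708102352...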






              share|cite|improve this answer










              $endgroup$



















                  answered Sep 16 at 15:19









                  Jack D'Aurizio

                  307k · 34 gold badges · 301 silver badges · 701 bronze badges



