Which GPUs to get for Mathematical Optimization (if any)?




The Machine Learning community has largely benefited from modern GPUs and several large companies are investing in new dedicated hardware.



Unfortunately, academic and commercial mathematical optimization solvers still lack support for GPUs. They do support distributed or shared-memory computing environments (e.g., see the Ubiquity Generator framework from ZIB), but it looks like GPUs raise different technical challenges for (discrete) math optimizers.



Here are my two questions:



  1. Which GPU, if any, should I get for mathematical optimization?

  2. Does there exist any mathematical optimization software that can fully exploit multiple modern GPUs?









Tags: solver parallel-computing gpu accelerated-hardware






asked Jul 17 at 8:13 by Stefano Gualandi; edited Jul 18 at 5:05 by Rodrigo de Azevedo























          5 Answers


















Answer (score 14) by Geoffrey De Smet, answered Jul 17 at 8:27:

I've not seen any efficient use of GPUs for metaheuristics - only experiments that proved their inefficiency for these algorithms. So not the right tool for the job, apparently. Maybe there's an undiscovered technique to make them work efficiently. (I have seen/built efficient use of multiple CPU cores for metaheuristics, even on Local Search with incremental fitness calculation.)



Let me define efficient use as: in an apples-to-apples comparison on multiple non-trivial datasets, running for the same amount of time (ideally a few minutes), the GPU strategy isn't dominated by the non-GPU strategy.
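As a concrete (and hedged) illustration of the parenthetical above, here is a minimal Python sketch of multi-start local search spread over CPU cores with multiprocessing, using an incremental (delta) fitness update per move. The objective and move operator are toy placeholders, not taken from the answer or any real metaheuristic library.

```python
# Minimal sketch: multi-start local search on multiple CPU cores,
# with an incremental fitness update per move (toy objective).
import random
from multiprocessing import Pool

def local_search(seed, n=200, iters=50_000):
    """Hill-climb a random 0/1 vector against a random linear objective."""
    rng = random.Random(seed)
    w = [rng.uniform(-1.0, 1.0) for _ in range(n)]   # toy objective weights
    x = [rng.randint(0, 1) for _ in range(n)]
    score = sum(wi * xi for wi, xi in zip(w, x))
    for _ in range(iters):
        i = rng.randrange(n)
        delta = w[i] * (1 - 2 * x[i])    # incremental change if bit i is flipped
        if delta > 0:                    # accept improving moves only
            x[i] ^= 1
            score += delta
    return score

if __name__ == "__main__":
    with Pool() as pool:                 # one independent restart per worker
        results = pool.map(local_search, range(8))
    print("best of 8 restarts:", max(results))
```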




























Answer (score 13) by JakobS, answered Jul 17 at 14:06, edited Jul 17 at 19:22:

If your problem is continuous, I would say that it might be beneficial. For problems that involve discrete variables, I've not seen anything that benefits from the use of a GPU.



GPUs aid problem solving if the underlying problem has a structure that can exploit the massively parallel architecture of the graphics processing unit. Calculations that involve large matrices are a good example. Deep learning is able to make very good use of GPUs because its computations can (roughly) be written as matrix-vector operations. There exist specialized BLAS and LAPACK implementations for GPUs (e.g. https://developer.nvidia.com/cublas), so algorithms making use of these are likely to see a speedup.

On the other hand, algorithms that exhibit inherent "branches" (i.e. decisions) within their flow cannot use the parallel computation capabilities of GPUs, because with each branch the problem changes a bit.
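To make the dense-matrix case above concrete, here is a minimal sketch (not from the answer) that dispatches the same matrix product to the CPU (NumPy and the host BLAS) and to the GPU (CuPy, which calls cuBLAS under the hood). It assumes a CUDA-capable GPU and the cupy package.

```python
# Minimal sketch: dense matrix-matrix product on CPU vs GPU (via cuBLAS/CuPy).
import time

import numpy as np
import cupy as cp

n = 4096
A_cpu = np.random.rand(n, n).astype(np.float32)
B_cpu = np.random.rand(n, n).astype(np.float32)

# CPU baseline.
t0 = time.perf_counter()
C_cpu = A_cpu @ B_cpu
cpu_time = time.perf_counter() - t0

# GPU version: same operation, dispatched to cuBLAS by CuPy.
A_gpu = cp.asarray(A_cpu)
B_gpu = cp.asarray(B_cpu)
cp.cuda.Stream.null.synchronize()          # make sure transfers are done
t0 = time.perf_counter()
C_gpu = A_gpu @ B_gpu
cp.cuda.Stream.null.synchronize()          # wait for the kernel to finish
gpu_time = time.perf_counter() - t0

print(f"CPU: {cpu_time:.3f}s  GPU: {gpu_time:.3f}s")
print("max |diff|:", float(cp.abs(C_gpu - cp.asarray(C_cpu)).max()))
```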



            So to answer your question(s):



            1. If your problem involves discrete decisions: You won't get any benefit from having a potent GPU.

              If your problem involves only continuous variables: It might be beneficial - but I've not seen any solver that claims to specifically exploit GPUs.

            2. I'm not aware of any.



























Answer (score 11) by Mathieu B, answered Jul 17 at 15:49, edited Jul 17 at 16:02 by Kevin Dalmeijer:

The first algorithm that comes to mind as able to benefit from GPUs is the Interior-Point Method (IPM); at its heart is the solution of a linear system. See these references:




1. GPU Acceleration of the Matrix-Free Interior Point Method

2. Cholesky Decomposition and Linear Programming on a GPU
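To make the connection concrete, here is a minimal sketch (mine, not taken from the referenced papers) of the linear-algebra kernel an IPM iteration spends most of its time in: forming the normal-equations matrix A D A^T and solving it by a Cholesky factorization on the GPU. It assumes CuPy and a CUDA GPU, and uses a dense random matrix purely for illustration.

```python
# Minimal sketch of the linear algebra inside one IPM iteration:
# form the (dense, for illustration) normal equations  M = A D A^T
# and solve  M dy = r  with a Cholesky factorization on the GPU.
import cupy as cp
from cupyx.scipy.linalg import solve_triangular

m, n = 500, 2000
A = cp.random.rand(m, n)                 # constraint matrix (dense here)
d = cp.random.rand(n) + 0.1              # positive scaling from the barrier term
r = cp.random.rand(m)                    # current right-hand side

M = (A * d) @ A.T                        # A @ diag(d) @ A.T without forming diag(d)
L = cp.linalg.cholesky(M)                # lower-triangular Cholesky factor
y = solve_triangular(L, r, lower=True)   # forward substitution
dy = solve_triangular(L.T, y, lower=False)  # back substitution

print("residual:", float(cp.linalg.norm(M @ dy - r)))  # should be ~0
```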



























Answer (score 9) by Brian Borchers, answered Jul 18 at 16:08, edited Jul 19 at 3:55:

                A lot depends on what kinds of computations you are doing. The subject of this group is "Operations Research", but that surely includes a range of computational work including discrete event simulation, machine learning, linear and nonlinear programming, discrete optimization, etc. There's no one answer applicable to all of those kinds of problems.



                For linear and nonlinear programming, one important issue is that nearly all of these computations are typically performed in double precision rather than single precision.



NVIDIA has adopted a strategy in which different models of their GPUs are optimized for different uses and priced differently. Except for the Tesla line of GPUs aimed at high-performance computing applications, most of NVIDIA's GPUs are configured so that double precision is much (e.g. 32 times) slower than single precision. This means that these other models are poorly suited for double-precision floating-point computations.
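A quick way to see this on whatever card you have (a hedged sketch, assuming CuPy and a CUDA GPU; not part of the original answer): time the same matrix product in single and double precision and look at the ratio.

```python
# Minimal sketch: compare FP32 vs FP64 throughput on the same GPU.
import time
import cupy as cp

def timed_matmul(dtype, n=4096, repeats=3):
    A = cp.random.rand(n, n).astype(dtype)
    B = cp.random.rand(n, n).astype(dtype)
    cp.cuda.Stream.null.synchronize()      # exclude transfer/setup time
    t0 = time.perf_counter()
    for _ in range(repeats):
        A @ B
    cp.cuda.Stream.null.synchronize()      # wait for all kernels to finish
    return (time.perf_counter() - t0) / repeats

t32 = timed_matmul(cp.float32)
t64 = timed_matmul(cp.float64)
print(f"FP32: {t32:.3f}s  FP64: {t64:.3f}s  slowdown: {t64 / t32:.1f}x")
```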



In contrast, most machine learning computing can be done in single precision (or even half precision). The inexpensive consumer-oriented GPUs sold by NVIDIA perform incredibly well on these kinds of computations.



Another important issue in linear algebra computations is whether the matrices that you're working with are sparse (have lots of zero entries) or dense (all or nearly all entries are nonzero). GPUs excel at dense matrix linear algebra but don't perform quite so well with sparse matrices. Nearly all linear programming models have sparse constraint matrices, and this is exploited by both simplex and interior-point solvers. Thus GPUs have not been very successful in linear programming (you'll notice that neither CPLEX nor Gurobi works with GPUs). The situation with nonlinear programming is somewhat more varied.
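To illustrate the sparsity point (a minimal sketch assuming CuPy and SciPy; not from the original answer): a sparse matrix-vector product in SciPy on the CPU versus cupyx.scipy.sparse on the GPU. For matrices as sparse as typical LP constraint matrices, the GPU advantage is usually much smaller than in the dense case, and can disappear entirely.

```python
# Minimal sketch: CPU vs GPU sparse matrix-vector product (CSR format).
import time
import numpy as np
import scipy.sparse as sp
import cupy as cp
import cupyx.scipy.sparse as cusparse

n = 100_000
A_cpu = sp.random(n, n, density=1e-4, format="csr", dtype=np.float64)
x_cpu = np.random.rand(n)
A_gpu = cusparse.csr_matrix(A_cpu)       # copy the CSR structure to the GPU
x_gpu = cp.asarray(x_cpu)

def timed(op, sync=lambda: None, repeats=20):
    sync()
    t0 = time.perf_counter()
    for _ in range(repeats):
        op()
    sync()
    return (time.perf_counter() - t0) / repeats

print("CPU SpMV:", timed(lambda: A_cpu @ x_cpu))
print("GPU SpMV:", timed(lambda: A_gpu @ x_gpu,
                         sync=cp.cuda.Stream.null.synchronize))
```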
















Comments:

• Stefano Gualandi (Jul 18 at 16:13): Thanks Brian! You're right, even in my limited experience with GPU, dealing with single or double precision makes a huge difference with GPU speed.

• JakobS (Jul 18 at 21:39): Very good point mentioning the sparsity of the problem.


















Answer (score 6) by Rob:


                1. Which GPU, if any, should I get for mathematical optimization?



In the case of commercially available software, where no source code is available, you are stuck using the GPU that is best supported by the applications you intend to run.



                • AmgX, cuSOLVER and nvGRAPH all require Nvidia GPUs, and offer supporting articles on their blog.


                • Cusp is a library for sparse linear algebra and graph computations based on Thrust. Cusp provides a flexible, high-level interface for manipulating sparse matrices and solving sparse linear systems. It is written to use CUDA.



                • Hyperlearn requires CUDA. Offers GPU acceleration of:



                  • Matrix Completion algorithms - Non Negative Least Squares, NNMF

• Batch Similarity Latent Dirichlet Allocation (BS-LDA)

                  • Correlation Regression

                  • Feasible Generalized Least Squares FGLS

                  • Outlier Tolerant Regression

                  • Multidimensional Spline Regression

                  • Generalized MICE (any model drop in replacement)

                  • Using Uber's Pyro for Bayesian Deep Learning


                • Matlab only supports GPU acceleration on Nvidia GPUs when using the Parallel Computing Toolbox, otherwise any graphics card supporting OpenGL 3.3 with 1GB GPU memory is recommended.


                • Pagmo2 supports both Nvidia and AMD GPU acceleration. Pagmo (C++) or pygmo (Python) is a scientific library for massively parallel optimization. It is built around the idea of providing a unified interface to optimization algorithms and to optimization problems and to make their deployment in massively parallel environments easy. A short list of some papers from the European Space Agency where pagmo was utilized.


• Python has a number of libraries that support CUDA, but there is not as much support for AMD GPUs and OpenCL; some libraries, such as Numba, support both GPU manufacturers, but Nvidia certainly blogs about it more (a small Numba kernel sketch follows after this list).


                • scikit-CUDA provides Python interfaces to many of the functions in the CUDA device/runtime, cuBLAS, cuFFT, and cuSOLVER libraries distributed as part of NVIDIA’s CUDA Programming Toolkit, as well as interfaces to select functions in the CULA Dense Toolkit. Both low-level wrapper functions similar to their C counterparts and high-level functions comparable to those in NumPy and Scipy are provided.


                • SuiteSparse libraries for sparse matrix operations on Nvidia GPUs.


• Theano combines aspects of a computer algebra system (CAS) with aspects of an optimizing compiler. It can also generate customized C code for many mathematical operations. This combination of CAS with optimizing compilation is particularly useful for tasks in which complicated mathematical expressions are evaluated repeatedly and evaluation speed is critical. For situations where many different expressions are each evaluated once, Theano can minimize the amount of compilation/analysis overhead, but still provide symbolic features such as automatic differentiation. It requires CUDA.


                • ViennaCL provides CUDA, OpenCL and OpenMP computing backends. It enables simple, high-level access to the vast computing resources available on parallel architectures such as GPUs and is primarily focused on common sparse and dense linear algebra operations (BLAS levels 1, 2 and 3). It also provides iterative solvers with optional preconditioners for large systems of equations.
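As a small illustration of the Python route mentioned in the list above, here is a minimal Numba CUDA kernel sketch (mine, not from the original answer). It assumes the numba package and an Nvidia GPU; Numba compiles the decorated function into a GPU kernel and handles the host/device array transfers.

```python
# Minimal sketch: a hand-written GPU kernel via Numba's CUDA target.
import numpy as np
from numba import cuda

@cuda.jit
def saxpy(a, x, y, out):
    i = cuda.grid(1)                 # global thread index
    if i < x.shape[0]:               # guard against out-of-range threads
        out[i] = a * x[i] + y[i]

n = 1_000_000
x = np.random.rand(n).astype(np.float32)
y = np.random.rand(n).astype(np.float32)
out = np.zeros_like(x)

threads = 256
blocks = (n + threads - 1) // threads
saxpy[blocks, threads](2.0, x, y, out)   # Numba copies arrays to/from the GPU

assert np.allclose(out, 2.0 * x + y)
```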


A good website for benchmarks for FP64, FP32 and FP16 is Lambda Labs; one article in particular ("Deep Learning GPU Benchmarks - Tesla V100 vs RTX 2080 Ti vs GTX 1080 Ti vs Titan V") offers a great bottom line on what you get and how much it costs. Don't let the DL slant discourage you: DL can be used for optimization (fast results, not guaranteed to be absolutely optimal), either as a starting point for your variables or as a final result. My purpose in mentioning this article is to cite these quotes:




                "Results summary



                As of February 8, 2019, the NVIDIA RTX 2080 Ti is the best GPU for deep learning research on a single GPU system running TensorFlow. A typical single GPU system with this GPU will be:



                • 37% faster than the 1080 Ti with FP32, 62% faster with FP16, and 25% more expensive.

                • 35% faster than the 2080 with FP32, 47% faster with FP16, and 25% more expensive.

                • 96% as fast as the Titan V with FP32, 3% faster with FP16, and ~1/2 of the cost.

                • 80% as fast as the Tesla V100 with FP32, 82% as fast with FP16, and ~1/5 of the cost.

                Note that all experiments utilized Tensor Cores when available and are priced out on a complete single GPU system cost. As a system builder and AI research company, we're trying to make benchmarks that are scientific, reproducible, correlate with real world training scenarios, and have accurate prices. So, we've decided to make the spreadsheet that generated our graphs and (performance / $) tables public."



                ...



                "2080 Ti vs V100 - is the 2080 Ti really that fast?



                How can the 2080 Ti be 80% as fast as the Tesla V100, but only 1/8th of the price? The answer is simple: NVIDIA wants to segment the market so that those with high willingness to pay (hyper scalers) only buy their TESLA line of cards which retail for ~$9,800. The RTX and GTX series of cards still offers the best performance per dollar.



                If you're not AWS, Azure, or Google Cloud then you're probably much better off buying the 2080 Ti. There are, however, a few key use cases where the V100s can come in handy:



                • If you need FP64 compute. If you're doing Computational Fluid Dynamics, n-body simulation, or other work that requires high numerical precision (FP64), then you'll need to buy the Titan V or V100s. If you're not sure if you need FP64, you don't. You would know.

                • If you absolutely need 32 GB of memory because your model size won't fit into 11 GB of memory with a batch size of 1. If you are creating your own model architecture and it simply can't fit even when you bring the batch size lower, the V100 could make sense. However, this is a pretty rare edge case. Fewer than 5% of our customers are using custom models. Most use something like ResNet, VGG, Inception, SSD, or Yolo.

                So. You're still wondering. Why would anybody buy the V100? It comes down to marketing.



                2080 Ti is a Porsche 911, the V100 is a Bugatti Veyron



                The V100 is a bit like a Bugatti Veyron. It's one of the fastest street legal cars in the world, ridiculously expensive, and, if you have to ask how much the insurance and maintenance is, you can't afford it. The RTX 2080 Ti, on the other hand, is like a Porsche 911. It's very fast, handles well, expensive but not ostentatious, and with the same amount of money you'd pay for the Bugatti, you can buy the Porsche, a home, a BMW 7-series, send three kids to college, and have money left over for retirement. [Rob's note: costs are different for him compared to my calculations.]



                And if you think I'm going overboard with the Porsche analogy, you can buy a DGX-1 8x V100 for $120,000 or a Lambda Blade 8x 2080 Ti for $28,000 and have enough left over for a real Porsche 911. Your pick.".




                Thus, you want to pick a GPU manufacturer that provides better benchmarks for the programs you want to run, unless you have the source code and possess some GPU tweaking skills.



The best deal is probably the AMD Radeon VII, with its FP64 rate of 1/4 for only US$700; and even though it's new, it's also being discontinued, so there may be some price drops coming. Unfortunately, while the hardware is probably a better deal for many people, the software that can wring that performance out of it is far less plentiful and not as well developed as what's available for an Nvidia card.




2. Does there exist any mathematical optimization software that can fully exploit multiple modern GPUs?



                All of the above links list software that benefits from more GPU cores, even if they are spread across multiple cards, multiple machines or even cloud GPU computing in some cases.



An often-quoted article about selecting a GPU and using multiple GPUs is "Which GPU(s) to Get for Deep Learning: My Experience and Advice for Using GPUs in Deep Learning" (2019-04-03) by Tim Dettmers. While its focus is on Deep Learning, it provides an excellent explanation of the difficulties and the performance increase to be expected when using multiple GPUs.



                He also says something about the usage of GPUs, in general (where applicable), but again it's in reference to DL (though still applicable to OR optimization):




                "Overall I think I still cannot give a clear recommendation for AMD GPUs for ordinary users that just want their GPUs to work smoothly. More experienced users should have fewer problems and by supporting AMD GPUs and ROCm/HIP developers they contribute to the combat against the monopoly position of NVIDIA as this will greatly benefit everyone in the long-term. If you are a GPU developer and want to make important contributions to GPU computing, then an AMD GPU might be the best way to make a good impact over the long-term. For everyone else, NVIDIA GPUs might be the safer choice.".




                Articles about using GPUs for Operations Research:



                • GPU Computing Applied to Linear and Mixed Integer Programming


                • GPU computing in discrete optimization. Part II: Survey focused on routing problems


                • gpuMF: a framework for parallel hybrid metaheuristics on GPU with application to the minimisation of harmonics in multilevel inverters


                I'll return later to expand this answer.
















Comments:

• Stefano Gualandi (Jul 22 at 5:36): Thanks. The post by Tim Dettmers inspired my question.

• Rob (Jul 22 at 5:45): @StefanoGualandi - You are most welcome. I was just on my way back with an addition when I was diverted, but you can check back tomorrow. BTW: How many bits are you planning on using most often (FP64, bfloat, INT2-8, etc.), and how much memory (can you afford, over paying for performance)? And what is your overall budget - do you want 4x$500 cards or 2x$3000 cards, for example? Do you want bang for your $, or simply very fast but not most expensive? Please add to your question any additional info you wish to offer and I'll try to address a specific situation.

• Stefano Gualandi (Jul 22 at 8:59): The last two papers you linked are interesting, but I need to find the time to read them carefully.

• JakobS (Jul 22 at 10:49): Indeed the last two papers look interesting. I browsed through the first one, which is a meta-analysis of several other papers that use GPUs for OR problems. The paper is somewhat old (from 2016, with many cited papers from the early 2010s), but even at that time the results, at least for exact approaches and simplex, look underwhelming: "The authors use randomly generated instances of ATSP with up to 16 cities", "...with 8,000 variables and 2,700 constraints". Often it is not even clear what they compare against.














                                        9














                                        9










                                        9







                                        $begingroup$

                                        A lot depends on what kinds of computations you are doing. The subject of this group is "Operations Research", but that surely includes a range of computational work including discrete event simulation, machine learning, linear and nonlinear programming, discrete optimization, etc. There's no one answer applicable to all of those kinds of problems.



                                        For linear and nonlinear programming, one important issue is that nearly all of these computations are typically performed in double precision rather than single precision.



                                        NVIDIA has adopted a strategy in which different models of their GPU's are optimized for different uses and priced differently. Except for the Tesla line of GPU's aimed at high performance computing applications, most of NVIDIA's GPU's are configured so that double precision is much (e.g. 32 times) slower than single precision. This means that these other models are poorly suited for double precision floating point computations.



                                        In contrast, most machine learning computing can be done in single precision (or even half precision.) The inexpensive consumer oriented GPU's sold by NVIDIA perform incredibly well on these kinds of computations.



                                        Another important issue in linear algebra computations is whether the matrices that you're working with are sparse (have lots of zero entries) or dense (all or nearly all entries are nonzero.) GPU's excel at dense matrix linear algebra but don't perform quite so well with sparse matrices. Nearly all linear programming models have sparse constraint matrices and this is exploited by both simplex and interior point solvers. Thus GPU's have not been very successful in linear programming (you'll notice that neither CPLEX nor GuRoBi work with GPU's.) The situation with nonlinear programming is somewhat more varied.






                                        share|improve this answer












                                        $endgroup$



                                        A lot depends on what kinds of computations you are doing. The subject of this group is "Operations Research", but that surely includes a range of computational work including discrete event simulation, machine learning, linear and nonlinear programming, discrete optimization, etc. There's no one answer applicable to all of those kinds of problems.



                                        For linear and nonlinear programming, one important issue is that nearly all of these computations are typically performed in double precision rather than single precision.



                                        NVIDIA has adopted a strategy in which different models of their GPU's are optimized for different uses and priced differently. Except for the Tesla line of GPU's aimed at high performance computing applications, most of NVIDIA's GPU's are configured so that double precision is much (e.g. 32 times) slower than single precision. This means that these other models are poorly suited for double precision floating point computations.



                                        In contrast, most machine learning computing can be done in single precision (or even half precision.) The inexpensive consumer oriented GPU's sold by NVIDIA perform incredibly well on these kinds of computations.



                                        Another important issue in linear algebra computations is whether the matrices that you're working with are sparse (have lots of zero entries) or dense (all or nearly all entries are nonzero.) GPU's excel at dense matrix linear algebra but don't perform quite so well with sparse matrices. Nearly all linear programming models have sparse constraint matrices and this is exploited by both simplex and interior point solvers. Thus GPU's have not been very successful in linear programming (you'll notice that neither CPLEX nor GuRoBi work with GPU's.) The situation with nonlinear programming is somewhat more varied.







                                        share|improve this answer















                                        share|improve this answer




                                        share|improve this answer








                                        edited Jul 19 at 3:55

























                                        answered Jul 18 at 16:08









                                        Brian BorchersBrian Borchers

                                        3461 silver badge7 bronze badges




                                        3461 silver badge7 bronze badges










                                        • 1




                                          $begingroup$
                                          Thanks Brian! You're right, even in my limited experience with GPU, dealing with single or double precision makes a huge difference with GPU speed.
                                          $endgroup$
                                          – Stefano Gualandi
                                          Jul 18 at 16:13






                                        • 2




                                          $begingroup$
                                          Very good point mentioning the sparsity of the problem.
                                          $endgroup$
                                          – JakobS
                                          Jul 18 at 21:39












                                        • 1




                                          $begingroup$
                                          Thanks Brian! You're right, even in my limited experience with GPU, dealing with single or double precision makes a huge difference with GPU speed.
                                          $endgroup$
                                          – Stefano Gualandi
                                          Jul 18 at 16:13






                                        • 2




                                          $begingroup$
                                          Very good point mentioning the sparsity of the problem.
                                          $endgroup$
                                          – JakobS
                                          Jul 18 at 21:39







                                        1




                                        1




                                        $begingroup$
                                        Thanks Brian! You're right, even in my limited experience with GPU, dealing with single or double precision makes a huge difference with GPU speed.
                                        $endgroup$
                                        – Stefano Gualandi
                                        Jul 18 at 16:13




                                        $begingroup$
                                        Thanks Brian! You're right, even in my limited experience with GPU, dealing with single or double precision makes a huge difference with GPU speed.
                                        $endgroup$
                                        – Stefano Gualandi
                                        Jul 18 at 16:13




                                        2




                                        2




                                        $begingroup$
                                        Very good point mentioning the sparsity of the problem.
                                        $endgroup$
                                        – JakobS
                                        Jul 18 at 21:39




                                        $begingroup$
                                        Very good point mentioning the sparsity of the problem.
                                        $endgroup$
                                        – JakobS
                                        Jul 18 at 21:39











                                        6
















                                        $begingroup$


                                        1. Which GPU, if any, should I get for mathematical optimization?



                                        In the case of commercially available software, where no source code is available, you are stuck using the GPU that is better supported by the applications you intend to run.



                                        • AmgX, cuSOLVER and nvGRAPH all require Nvidia GPUs, and offer supporting articles on their blog.


                                        • Cusp is a library for sparse linear algebra and graph computations based on Thrust. Cusp provides a flexible, high-level interface for manipulating sparse matrices and solving sparse linear systems. It is written to use CUDA.



                                        • Hyperlearn requires CUDA. Offers GPU acceleration of:



                                          • Matrix Completion algorithms - Non Negative Least Squares, NNMF

                                          • Batch Similarity Latent Dirichelt Allocation (BS-LDA)

                                          • Correlation Regression

                                          • Feasible Generalized Least Squares FGLS

                                          • Outlier Tolerant Regression

                                          • Multidimensional Spline Regression

                                          • Generalized MICE (any model drop in replacement)

                                          • Using Uber's Pyro for Bayesian Deep Learning


                                        • Matlab only supports GPU acceleration on Nvidia GPUs when using the Parallel Computing Toolbox, otherwise any graphics card supporting OpenGL 3.3 with 1GB GPU memory is recommended.


                                        • Pagmo2 supports both Nvidia and AMD GPU acceleration. Pagmo (C++) or pygmo (Python) is a scientific library for massively parallel optimization. It is built around the idea of providing a unified interface to optimization algorithms and to optimization problems and to make their deployment in massively parallel environments easy. A short list of some papers from the European Space Agency where pagmo was utilized.


• Python has a number of libraries that support CUDA, but far less support for AMD GPUs and OpenCL; some libraries, such as Numba, support both GPU vendors, though Nvidia certainly blogs about it more (see the short sketch after this list).


                                        • scikit-CUDA provides Python interfaces to many of the functions in the CUDA device/runtime, cuBLAS, cuFFT, and cuSOLVER libraries distributed as part of NVIDIA’s CUDA Programming Toolkit, as well as interfaces to select functions in the CULA Dense Toolkit. Both low-level wrapper functions similar to their C counterparts and high-level functions comparable to those in NumPy and Scipy are provided.


                                        • SuiteSparse libraries for sparse matrix operations on Nvidia GPUs.


• Theano combines aspects of a computer algebra system (CAS) with aspects of an optimizing compiler. It can also generate customized C code for many mathematical operations. This combination of CAS with optimizing compilation is particularly useful for tasks in which complicated mathematical expressions are evaluated repeatedly and evaluation speed is critical. For situations where many different expressions are each evaluated once, Theano can minimize the amount of compilation/analysis overhead but still provide symbolic features such as automatic differentiation. It requires CUDA.


                                        • ViennaCL provides CUDA, OpenCL and OpenMP computing backends. It enables simple, high-level access to the vast computing resources available on parallel architectures such as GPUs and is primarily focused on common sparse and dense linear algebra operations (BLAS levels 1, 2 and 3). It also provides iterative solvers with optional preconditioners for large systems of equations.
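As an aside, the kind of dense vector update that sits inside many first-order or projection-type optimization methods can be offloaded to a GPU in very few lines with Numba. The following is a minimal sketch, assuming a CUDA-capable Nvidia GPU and a working numba installation; the kernel, sizes and names are illustrative and not taken from any of the libraries above.

```python
import numpy as np
from numba import cuda

@cuda.jit
def saxpy(alpha, x, y, out):
    # One thread per vector element: out[i] = alpha * x[i] + y[i]
    i = cuda.grid(1)
    if i < x.size:
        out[i] = alpha * x[i] + y[i]

n = 1_000_000
x = np.random.rand(n).astype(np.float32)   # FP32 is far faster than FP64 on consumer cards
y = np.random.rand(n).astype(np.float32)

# Explicit transfers keep the data on the device between kernel launches
d_x, d_y = cuda.to_device(x), cuda.to_device(y)
d_out = cuda.device_array_like(x)

threads_per_block = 256
blocks = (n + threads_per_block - 1) // threads_per_block
saxpy[blocks, threads_per_block](np.float32(2.0), d_x, d_y, d_out)

out = d_out.copy_to_host()
```

The pattern (transfer once, then launch many small kernels) is what most of the libraries above automate for you; the speed-up over NumPy only materializes once the vectors are large enough to amortize the host-device transfers.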


A good website for FP64, FP32 and FP16 benchmarks is Lambda Labs; one article in particular ("Deep Learning GPU Benchmarks - Tesla V100 vs RTX 2080 Ti vs GTX 1080 Ti vs Titan V") offers a great bottom line on what you get and how much it costs. Don't let the DL slant discourage you: DL can be used for optimization (fast results, not guaranteed to be optimal), either as a starting point for your variables or as a final result. My purpose in mentioning this article is to cite these quotes:




                                        "Results summary



                                        As of February 8, 2019, the NVIDIA RTX 2080 Ti is the best GPU for deep learning research on a single GPU system running TensorFlow. A typical single GPU system with this GPU will be:



                                        • 37% faster than the 1080 Ti with FP32, 62% faster with FP16, and 25% more expensive.

                                        • 35% faster than the 2080 with FP32, 47% faster with FP16, and 25% more expensive.

                                        • 96% as fast as the Titan V with FP32, 3% faster with FP16, and ~1/2 of the cost.

                                        • 80% as fast as the Tesla V100 with FP32, 82% as fast with FP16, and ~1/5 of the cost.

                                        Note that all experiments utilized Tensor Cores when available and are priced out on a complete single GPU system cost. As a system builder and AI research company, we're trying to make benchmarks that are scientific, reproducible, correlate with real world training scenarios, and have accurate prices. So, we've decided to make the spreadsheet that generated our graphs and (performance / $) tables public."



                                        ...



                                        "2080 Ti vs V100 - is the 2080 Ti really that fast?



                                        How can the 2080 Ti be 80% as fast as the Tesla V100, but only 1/8th of the price? The answer is simple: NVIDIA wants to segment the market so that those with high willingness to pay (hyper scalers) only buy their TESLA line of cards which retail for ~$9,800. The RTX and GTX series of cards still offers the best performance per dollar.



                                        If you're not AWS, Azure, or Google Cloud then you're probably much better off buying the 2080 Ti. There are, however, a few key use cases where the V100s can come in handy:



                                        • If you need FP64 compute. If you're doing Computational Fluid Dynamics, n-body simulation, or other work that requires high numerical precision (FP64), then you'll need to buy the Titan V or V100s. If you're not sure if you need FP64, you don't. You would know.

                                        • If you absolutely need 32 GB of memory because your model size won't fit into 11 GB of memory with a batch size of 1. If you are creating your own model architecture and it simply can't fit even when you bring the batch size lower, the V100 could make sense. However, this is a pretty rare edge case. Fewer than 5% of our customers are using custom models. Most use something like ResNet, VGG, Inception, SSD, or Yolo.

                                        So. You're still wondering. Why would anybody buy the V100? It comes down to marketing.



                                        2080 Ti is a Porsche 911, the V100 is a Bugatti Veyron



                                        The V100 is a bit like a Bugatti Veyron. It's one of the fastest street legal cars in the world, ridiculously expensive, and, if you have to ask how much the insurance and maintenance is, you can't afford it. The RTX 2080 Ti, on the other hand, is like a Porsche 911. It's very fast, handles well, expensive but not ostentatious, and with the same amount of money you'd pay for the Bugatti, you can buy the Porsche, a home, a BMW 7-series, send three kids to college, and have money left over for retirement. [Rob's note: costs are different for him compared to my calculations.]



                                        And if you think I'm going overboard with the Porsche analogy, you can buy a DGX-1 8x V100 for $120,000 or a Lambda Blade 8x 2080 Ti for $28,000 and have enough left over for a real Porsche 911. Your pick.".




                                        Thus, you want to pick a GPU manufacturer that provides better benchmarks for the programs you want to run, unless you have the source code and possess some GPU tweaking skills.



The best deal is probably the AMD Radeon VII, with its 1/4-rate FP64 for only US$700; even though it is new, it is also being discontinued, so there may be some price drops coming. Unfortunately, while the hardware is probably the better deal for many people, the software that can wrestle that performance out of it is far scarcer and less developed than what is available for an Nvidia card.




2. Does there exist any mathematical optimization software that can fully exploit multiple modern GPUs?



                                        All of the above links list software that benefits from more GPU cores, even if they are spread across multiple cards, multiple machines or even cloud GPU computing in some cases.
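For a concrete feel of the "massively parallel" interface mentioned for pagmo/pygmo above, here is a minimal sketch, assuming pygmo is installed. Note that pygmo itself distributes work over islands (threads or processes); whether a GPU is involved depends entirely on how the user-defined problem's fitness is implemented, so treat this only as an illustration of the parallel-evolution API on a stock CPU test problem.

```python
import pygmo as pg

prob = pg.problem(pg.rosenbrock(dim=30))   # stock benchmark problem (CPU-only fitness)
algo = pg.algorithm(pg.sade(gen=200))      # self-adaptive differential evolution

# Eight islands, each evolving a population of 20 candidates in parallel
archi = pg.archipelago(n=8, algo=algo, prob=prob, pop_size=20)
archi.evolve()
archi.wait_check()

best = min(f[0] for f in archi.get_champions_f())
print("best objective found:", best)
```

Swapping the stock problem for a user-defined problem whose fitness evaluation calls into a GPU library (for example via Numba or scikit-CUDA) is how GPU acceleration would actually enter this picture.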



An often-quoted article about selecting a GPU and using multiple GPUs is "Which GPU(s) to Get for Deep Learning: My Experience and Advice for Using GPUs in Deep Learning" (2019-04-03) by Tim Dettmers. While its focus is on Deep Learning, it provides an excellent explanation of the difficulties, and of the performance increase to expect, when using multiple GPUs.



He also offers general advice on choosing a GPU vendor; again it is written with DL in mind, but it is still applicable to OR optimization:




                                        "Overall I think I still cannot give a clear recommendation for AMD GPUs for ordinary users that just want their GPUs to work smoothly. More experienced users should have fewer problems and by supporting AMD GPUs and ROCm/HIP developers they contribute to the combat against the monopoly position of NVIDIA as this will greatly benefit everyone in the long-term. If you are a GPU developer and want to make important contributions to GPU computing, then an AMD GPU might be the best way to make a good impact over the long-term. For everyone else, NVIDIA GPUs might be the safer choice.".




                                        Articles about using GPUs for Operations Research:



                                        • GPU Computing Applied to Linear and Mixed Integer Programming


                                        • GPU computing in discrete optimization. Part II: Survey focused on routing problems


                                        • gpuMF: a framework for parallel hybrid metaheuristics on GPU with application to the minimisation of harmonics in multilevel inverters


                                        I'll return later to expand this answer.






share|improve this answer

$endgroup$

edited Jul 22 at 18:41

answered Jul 22 at 4:21

Rob
1,631 reputation • 1 gold badge • 6 silver badges • 27 bronze badges










                                        • 1




                                          $begingroup$
Thanks. The post by Tim Dettmers inspired my question.
                                          $endgroup$
                                          – Stefano Gualandi
                                          Jul 22 at 5:36










                                        • $begingroup$
                                          @StefanoGualandi - You are most welcome. I was just on my way back with an addition when I was diverted, but you can check back tomorrow. BTW: How many bits are you planning on using most often (FP64, bfloat, INT2-8, etc.), how much memory (can you afford, over paying for performance). And, what is your overall budget - do you want 4x$500 cards or 2x$3000 cards, for example. Do you want bang for your $, or simply very fast but not most expensive? Please add to your question any additional info you wish to offer and I'll try to address a specific situation.
                                          $endgroup$
                                          – Rob
                                          Jul 22 at 5:45











                                        • $begingroup$
The last two papers you linked are interesting, but I need to find the time to read them carefully.
                                          $endgroup$
                                          – Stefano Gualandi
                                          Jul 22 at 8:59










                                        • $begingroup$
                                          indeed the last two papers look interesting. I browsed through the first one which is a meta analysis of several other papers that use GPUs for OR problems. The paper is somewhat old (from 2016 with many cited papers from early 201Xs) but even at that time the results at least for exact approaches and simplex look underwhelming: "The authors use randomly generated instances of ATSP with up to 16 cities", "...with 8,000 variables and 2,700 constraints". Often it is not even clear what they compare against.
                                          $endgroup$
                                          – JakobS
                                          Jul 22 at 10:49















                                        6
















                                        $begingroup$


                                        1. Which GPU, if any, should I get for mathematical optimization?



                                        In the case of commercially available software, where no source code is available, you are stuck using the GPU that is better supported by the applications you intend to run.



                                        • AmgX, cuSOLVER and nvGRAPH all require Nvidia GPUs, and offer supporting articles on their blog.


                                        • Cusp is a library for sparse linear algebra and graph computations based on Thrust. Cusp provides a flexible, high-level interface for manipulating sparse matrices and solving sparse linear systems. It is written to use CUDA.



                                        • Hyperlearn requires CUDA. Offers GPU acceleration of:



                                          • Matrix Completion algorithms - Non Negative Least Squares, NNMF

                                          • Batch Similarity Latent Dirichelt Allocation (BS-LDA)

                                          • Correlation Regression

                                          • Feasible Generalized Least Squares FGLS

                                          • Outlier Tolerant Regression

                                          • Multidimensional Spline Regression

                                          • Generalized MICE (any model drop in replacement)

                                          • Using Uber's Pyro for Bayesian Deep Learning


                                        • Matlab only supports GPU acceleration on Nvidia GPUs when using the Parallel Computing Toolbox, otherwise any graphics card supporting OpenGL 3.3 with 1GB GPU memory is recommended.


                                        • Pagmo2 supports both Nvidia and AMD GPU acceleration. Pagmo (C++) or pygmo (Python) is a scientific library for massively parallel optimization. It is built around the idea of providing a unified interface to optimization algorithms and to optimization problems and to make their deployment in massively parallel environments easy. A short list of some papers from the European Space Agency where pagmo was utilized.


                                        • Python has a number of libraries that support CUDA, but not as much support for AMD GPUs and OpenCL, some libraries such as Numba support both GPU manufacturers but Nvidia certainly blogs about it more.


                                        • scikit-CUDA provides Python interfaces to many of the functions in the CUDA device/runtime, cuBLAS, cuFFT, and cuSOLVER libraries distributed as part of NVIDIA’s CUDA Programming Toolkit, as well as interfaces to select functions in the CULA Dense Toolkit. Both low-level wrapper functions similar to their C counterparts and high-level functions comparable to those in NumPy and Scipy are provided.


                                        • SuiteSparse libraries for sparse matrix operations on Nvidia GPUs.


                                        • Theano combines aspects of a computer algebra system (CAS) with aspects of an optimizing compiler. It can also generate customized C code for many mathematical operations. This combination of CAS with optimizing compilation is particularly useful for tasks in which complicated mathematical expressions are evaluated repeatedly and evaluation speed is critical. For situations where many different expressions are each evaluated once Theano can minimize the amount of compilation/analysis overhead, but still provide symbolic features such as automatic differentiation.It requires CUDA.


                                        • ViennaCL provides CUDA, OpenCL and OpenMP computing backends. It enables simple, high-level access to the vast computing resources available on parallel architectures such as GPUs and is primarily focused on common sparse and dense linear algebra operations (BLAS levels 1, 2 and 3). It also provides iterative solvers with optional preconditioners for large systems of equations.


                                        A good website for benchmarks for FP64, FP32 and FP16 is Lambda Labs, one article in particular ("Deep Learning GPU Benchmarks - Tesla V100 vs RTX 2080 Ti vs GTX 1080 Ti vs Titan V") offers a great bottom line on what you get, and how much it costs. Don't let the DL slant discourage you, DL can be used for optimization (fast results, not guaranteed to be absolutely optimal) either as a starting point for your variables or a final result. My purpose of mentioning this article is to cite these quotes:




                                        "Results summary



                                        As of February 8, 2019, the NVIDIA RTX 2080 Ti is the best GPU for deep learning research on a single GPU system running TensorFlow. A typical single GPU system with this GPU will be:



                                        • 37% faster than the 1080 Ti with FP32, 62% faster with FP16, and 25% more expensive.

                                        • 35% faster than the 2080 with FP32, 47% faster with FP16, and 25% more expensive.

                                        • 96% as fast as the Titan V with FP32, 3% faster with FP16, and ~1/2 of the cost.

                                        • 80% as fast as the Tesla V100 with FP32, 82% as fast with FP16, and ~1/5 of the cost.

                                        Note that all experiments utilized Tensor Cores when available and are priced out on a complete single GPU system cost. As a system builder and AI research company, we're trying to make benchmarks that are scientific, reproducible, correlate with real world training scenarios, and have accurate prices. So, we've decided to make the spreadsheet that generated our graphs and (performance / $) tables public."



                                        ...



                                        "2080 Ti vs V100 - is the 2080 Ti really that fast?



                                        How can the 2080 Ti be 80% as fast as the Tesla V100, but only 1/8th of the price? The answer is simple: NVIDIA wants to segment the market so that those with high willingness to pay (hyper scalers) only buy their TESLA line of cards which retail for ~$9,800. The RTX and GTX series of cards still offers the best performance per dollar.



                                        If you're not AWS, Azure, or Google Cloud then you're probably much better off buying the 2080 Ti. There are, however, a few key use cases where the V100s can come in handy:



                                        • If you need FP64 compute. If you're doing Computational Fluid Dynamics, n-body simulation, or other work that requires high numerical precision (FP64), then you'll need to buy the Titan V or V100s. If you're not sure if you need FP64, you don't. You would know.

                                        • If you absolutely need 32 GB of memory because your model size won't fit into 11 GB of memory with a batch size of 1. If you are creating your own model architecture and it simply can't fit even when you bring the batch size lower, the V100 could make sense. However, this is a pretty rare edge case. Fewer than 5% of our customers are using custom models. Most use something like ResNet, VGG, Inception, SSD, or Yolo.

                                        So. You're still wondering. Why would anybody buy the V100? It comes down to marketing.



                                        2080 Ti is a Porsche 911, the V100 is a Bugatti Veyron



                                        The V100 is a bit like a Bugatti Veyron. It's one of the fastest street legal cars in the world, ridiculously expensive, and, if you have to ask how much the insurance and maintenance is, you can't afford it. The RTX 2080 Ti, on the other hand, is like a Porsche 911. It's very fast, handles well, expensive but not ostentatious, and with the same amount of money you'd pay for the Bugatti, you can buy the Porsche, a home, a BMW 7-series, send three kids to college, and have money left over for retirement. [Rob's note: costs are different for him compared to my calculations.]



                                        And if you think I'm going overboard with the Porsche analogy, you can buy a DGX-1 8x V100 for $120,000 or a Lambda Blade 8x 2080 Ti for $28,000 and have enough left over for a real Porsche 911. Your pick.".




                                        Thus, you want to pick a GPU manufacturer that provides better benchmarks for the programs you want to run, unless you have the source code and possess some GPU tweaking skills.



                                        The best deal is probably the AMD Radeon VII with it's FP64 rate of 1/4 for only U$700, and even though it's new it's also being discontinued; so there may be some price drops coming. Unfortunately while the hardware is probably a better deal for many people the amount of software available that can wrestle the performance out of it is far fewer and not as developed as what's available for an Nvidia card.




                                        1. Does there exist any mathematical optimization software that can fully exploit multiple modern GPUs?



                                        All of the above links list software that benefits from more GPU cores, even if they are spread across multiple cards, multiple machines or even cloud GPU computing in some cases.



                                        An often quoted article about selecting a GPU and using multiple GPUs is: "Which GPU(s) to Get for Deep Learning: My Experience and Advice for Using GPUs in Deep Learning" (2019-04-03) by Tim Dettmers. While it's focus is on Deep Learning it provides an excellent explanation of the difficulties and performance increase to be expected when using multiple GPUs.



                                        He also says something about the usage of GPUs, in general (where applicable), but again it's in reference to DL (though still applicable to OR optimization):




                                        "Overall I think I still cannot give a clear recommendation for AMD GPUs for ordinary users that just want their GPUs to work smoothly. More experienced users should have fewer problems and by supporting AMD GPUs and ROCm/HIP developers they contribute to the combat against the monopoly position of NVIDIA as this will greatly benefit everyone in the long-term. If you are a GPU developer and want to make important contributions to GPU computing, then an AMD GPU might be the best way to make a good impact over the long-term. For everyone else, NVIDIA GPUs might be the safer choice.".




                                        Articles about using GPUs for Operations Research:



                                        • GPU Computing Applied to Linear and Mixed Integer Programming


                                        • GPU computing in discrete optimization. Part II: Survey focused on routing problems


                                        • gpuMF: a framework for parallel hybrid metaheuristics on GPU with application to the minimisation of harmonics in multilevel inverters


                                        I'll return later to expand this answer.






                                        share|improve this answer












                                        $endgroup$










                                        • 1




                                          $begingroup$
                                          Thanks. The post by Tim Dettmers was ispiring my question.
                                          $endgroup$
                                          – Stefano Gualandi
                                          Jul 22 at 5:36










                                        • $begingroup$
                                          @StefanoGualandi - You are most welcome. I was just on my way back with an addition when I was diverted, but you can check back tomorrow. BTW: How many bits are you planning on using most often (FP64, bfloat, INT2-8, etc.), how much memory (can you afford, over paying for performance). And, what is your overall budget - do you want 4x$500 cards or 2x$3000 cards, for example. Do you want bang for your $, or simply very fast but not most expensive? Please add to your question any additional info you wish to offer and I'll try to address a specific situation.
                                          $endgroup$
                                          – Rob
                                          Jul 22 at 5:45











                                        • $begingroup$
                                          the last two papers you linked are interesting, but I need to find out the time to read them carefully.
                                          $endgroup$
                                          – Stefano Gualandi
                                          Jul 22 at 8:59










                                        • $begingroup$
                                          indeed the last two papers look interesting. I browsed through the first one which is a meta analysis of several other papers that use GPUs for OR problems. The paper is somewhat old (from 2016 with many cited papers from early 201Xs) but even at that time the results at least for exact approaches and simplex look underwhelming: "The authors use randomly generated instances of ATSP with up to 16 cities", "...with 8,000 variables and 2,700 constraints". Often it is not even clear what they compare against.
                                          $endgroup$
                                          – JakobS
                                          Jul 22 at 10:49













                                        6














                                        6










                                        6







                                        $begingroup$


                                        1. Which GPU, if any, should I get for mathematical optimization?



                                        In the case of commercially available software, where no source code is available, you are stuck using the GPU that is better supported by the applications you intend to run.



                                        • AmgX, cuSOLVER and nvGRAPH all require Nvidia GPUs, and offer supporting articles on their blog.


                                        • Cusp is a library for sparse linear algebra and graph computations based on Thrust. Cusp provides a flexible, high-level interface for manipulating sparse matrices and solving sparse linear systems. It is written to use CUDA.



                                        • Hyperlearn requires CUDA. Offers GPU acceleration of:



                                          • Matrix Completion algorithms - Non Negative Least Squares, NNMF

                                          • Batch Similarity Latent Dirichelt Allocation (BS-LDA)

                                          • Correlation Regression

                                          • Feasible Generalized Least Squares FGLS

                                          • Outlier Tolerant Regression

                                          • Multidimensional Spline Regression

                                          • Generalized MICE (any model drop in replacement)

                                          • Using Uber's Pyro for Bayesian Deep Learning


                                        • Matlab only supports GPU acceleration on Nvidia GPUs when using the Parallel Computing Toolbox, otherwise any graphics card supporting OpenGL 3.3 with 1GB GPU memory is recommended.


                                        • Pagmo2 supports both Nvidia and AMD GPU acceleration. Pagmo (C++) or pygmo (Python) is a scientific library for massively parallel optimization. It is built around the idea of providing a unified interface to optimization algorithms and to optimization problems and to make their deployment in massively parallel environments easy. A short list of some papers from the European Space Agency where pagmo was utilized.


                                        • Python has a number of libraries that support CUDA, but not as much support for AMD GPUs and OpenCL, some libraries such as Numba support both GPU manufacturers but Nvidia certainly blogs about it more.


                                        • scikit-CUDA provides Python interfaces to many of the functions in the CUDA device/runtime, cuBLAS, cuFFT, and cuSOLVER libraries distributed as part of NVIDIA’s CUDA Programming Toolkit, as well as interfaces to select functions in the CULA Dense Toolkit. Both low-level wrapper functions similar to their C counterparts and high-level functions comparable to those in NumPy and Scipy are provided.


                                        • SuiteSparse libraries for sparse matrix operations on Nvidia GPUs.


                                        • Theano combines aspects of a computer algebra system (CAS) with aspects of an optimizing compiler. It can also generate customized C code for many mathematical operations. This combination of CAS with optimizing compilation is particularly useful for tasks in which complicated mathematical expressions are evaluated repeatedly and evaluation speed is critical. For situations where many different expressions are each evaluated once Theano can minimize the amount of compilation/analysis overhead, but still provide symbolic features such as automatic differentiation.It requires CUDA.


                                        • ViennaCL provides CUDA, OpenCL and OpenMP computing backends. It enables simple, high-level access to the vast computing resources available on parallel architectures such as GPUs and is primarily focused on common sparse and dense linear algebra operations (BLAS levels 1, 2 and 3). It also provides iterative solvers with optional preconditioners for large systems of equations.


                                        A good website for benchmarks for FP64, FP32 and FP16 is Lambda Labs, one article in particular ("Deep Learning GPU Benchmarks - Tesla V100 vs RTX 2080 Ti vs GTX 1080 Ti vs Titan V") offers a great bottom line on what you get, and how much it costs. Don't let the DL slant discourage you, DL can be used for optimization (fast results, not guaranteed to be absolutely optimal) either as a starting point for your variables or a final result. My purpose of mentioning this article is to cite these quotes:




                                        "Results summary



                                        As of February 8, 2019, the NVIDIA RTX 2080 Ti is the best GPU for deep learning research on a single GPU system running TensorFlow. A typical single GPU system with this GPU will be:



                                        • 37% faster than the 1080 Ti with FP32, 62% faster with FP16, and 25% more expensive.

                                        • 35% faster than the 2080 with FP32, 47% faster with FP16, and 25% more expensive.

                                        • 96% as fast as the Titan V with FP32, 3% faster with FP16, and ~1/2 of the cost.

                                        • 80% as fast as the Tesla V100 with FP32, 82% as fast with FP16, and ~1/5 of the cost.

                                        Note that all experiments utilized Tensor Cores when available and are priced out on a complete single GPU system cost. As a system builder and AI research company, we're trying to make benchmarks that are scientific, reproducible, correlate with real world training scenarios, and have accurate prices. So, we've decided to make the spreadsheet that generated our graphs and (performance / $) tables public."



                                        ...



                                        "2080 Ti vs V100 - is the 2080 Ti really that fast?



                                        How can the 2080 Ti be 80% as fast as the Tesla V100, but only 1/8th of the price? The answer is simple: NVIDIA wants to segment the market so that those with high willingness to pay (hyper scalers) only buy their TESLA line of cards which retail for ~$9,800. The RTX and GTX series of cards still offers the best performance per dollar.



                                        If you're not AWS, Azure, or Google Cloud then you're probably much better off buying the 2080 Ti. There are, however, a few key use cases where the V100s can come in handy:



                                        • If you need FP64 compute. If you're doing Computational Fluid Dynamics, n-body simulation, or other work that requires high numerical precision (FP64), then you'll need to buy the Titan V or V100s. If you're not sure if you need FP64, you don't. You would know.

                                        • If you absolutely need 32 GB of memory because your model size won't fit into 11 GB of memory with a batch size of 1. If you are creating your own model architecture and it simply can't fit even when you bring the batch size lower, the V100 could make sense. However, this is a pretty rare edge case. Fewer than 5% of our customers are using custom models. Most use something like ResNet, VGG, Inception, SSD, or Yolo.

                                        So. You're still wondering. Why would anybody buy the V100? It comes down to marketing.



                                        2080 Ti is a Porsche 911, the V100 is a Bugatti Veyron



                                        The V100 is a bit like a Bugatti Veyron. It's one of the fastest street legal cars in the world, ridiculously expensive, and, if you have to ask how much the insurance and maintenance is, you can't afford it. The RTX 2080 Ti, on the other hand, is like a Porsche 911. It's very fast, handles well, expensive but not ostentatious, and with the same amount of money you'd pay for the Bugatti, you can buy the Porsche, a home, a BMW 7-series, send three kids to college, and have money left over for retirement. [Rob's note: costs are different for him compared to my calculations.]



                                        And if you think I'm going overboard with the Porsche analogy, you can buy a DGX-1 8x V100 for $120,000 or a Lambda Blade 8x 2080 Ti for $28,000 and have enough left over for a real Porsche 911. Your pick.".




                                        Thus, you want to pick a GPU manufacturer that provides better benchmarks for the programs you want to run, unless you have the source code and possess some GPU tweaking skills.



                                        The best deal is probably the AMD Radeon VII with it's FP64 rate of 1/4 for only U$700, and even though it's new it's also being discontinued; so there may be some price drops coming. Unfortunately while the hardware is probably a better deal for many people the amount of software available that can wrestle the performance out of it is far fewer and not as developed as what's available for an Nvidia card.




                                        1. Does there exist any mathematical optimization software that can fully exploit multiple modern GPUs?



                                        All of the above links list software that benefits from more GPU cores, even if they are spread across multiple cards, multiple machines or even cloud GPU computing in some cases.



                                        An often quoted article about selecting a GPU and using multiple GPUs is: "Which GPU(s) to Get for Deep Learning: My Experience and Advice for Using GPUs in Deep Learning" (2019-04-03) by Tim Dettmers. While it's focus is on Deep Learning it provides an excellent explanation of the difficulties and performance increase to be expected when using multiple GPUs.



                                        He also says something about the usage of GPUs, in general (where applicable), but again it's in reference to DL (though still applicable to OR optimization):




                                        "Overall I think I still cannot give a clear recommendation for AMD GPUs for ordinary users that just want their GPUs to work smoothly. More experienced users should have fewer problems and by supporting AMD GPUs and ROCm/HIP developers they contribute to the combat against the monopoly position of NVIDIA as this will greatly benefit everyone in the long-term. If you are a GPU developer and want to make important contributions to GPU computing, then an AMD GPU might be the best way to make a good impact over the long-term. For everyone else, NVIDIA GPUs might be the safer choice.".




                                        Articles about using GPUs for Operations Research:



                                        • GPU Computing Applied to Linear and Mixed Integer Programming


                                        • GPU computing in discrete optimization. Part II: Survey focused on routing problems


                                        • gpuMF: a framework for parallel hybrid metaheuristics on GPU with application to the minimisation of harmonics in multilevel inverters


                                        I'll return later to expand this answer.






                                        share|improve this answer












                                        $endgroup$




                                        1. Which GPU, if any, should I get for mathematical optimization?



                                        In the case of commercially available software, where no source code is available, you are stuck using the GPU that is better supported by the applications you intend to run.



                                        • AmgX, cuSOLVER and nvGRAPH all require Nvidia GPUs, and offer supporting articles on their blog.


                                        • Cusp is a library for sparse linear algebra and graph computations based on Thrust. Cusp provides a flexible, high-level interface for manipulating sparse matrices and solving sparse linear systems. It is written to use CUDA.



                                        • Hyperlearn requires CUDA. Offers GPU acceleration of:



                                          • Matrix Completion algorithms - Non Negative Least Squares, NNMF

                                          • Batch Similarity Latent Dirichelt Allocation (BS-LDA)

                                          • Correlation Regression

                                          • Feasible Generalized Least Squares FGLS

                                          • Outlier Tolerant Regression

                                          • Multidimensional Spline Regression

                                          • Generalized MICE (any model drop in replacement)

                                          • Using Uber's Pyro for Bayesian Deep Learning


                                        • Matlab only supports GPU acceleration on Nvidia GPUs when using the Parallel Computing Toolbox, otherwise any graphics card supporting OpenGL 3.3 with 1GB GPU memory is recommended.


                                        • Pagmo2 supports both Nvidia and AMD GPU acceleration. Pagmo (C++) or pygmo (Python) is a scientific library for massively parallel optimization. It is built around the idea of providing a unified interface to optimization algorithms and to optimization problems and to make their deployment in massively parallel environments easy. A short list of some papers from the European Space Agency where pagmo was utilized.


                                        • Python has a number of libraries that support CUDA, but not as much support for AMD GPUs and OpenCL, some libraries such as Numba support both GPU manufacturers but Nvidia certainly blogs about it more.


                                        • scikit-CUDA provides Python interfaces to many of the functions in the CUDA device/runtime, cuBLAS, cuFFT, and cuSOLVER libraries distributed as part of NVIDIA’s CUDA Programming Toolkit, as well as interfaces to select functions in the CULA Dense Toolkit. Both low-level wrapper functions similar to their C counterparts and high-level functions comparable to those in NumPy and Scipy are provided.


                                        • SuiteSparse libraries for sparse matrix operations on Nvidia GPUs.


                                        • Theano combines aspects of a computer algebra system (CAS) with aspects of an optimizing compiler. It can also generate customized C code for many mathematical operations. This combination of CAS with optimizing compilation is particularly useful for tasks in which complicated mathematical expressions are evaluated repeatedly and evaluation speed is critical. For situations where many different expressions are each evaluated once Theano can minimize the amount of compilation/analysis overhead, but still provide symbolic features such as automatic differentiation.It requires CUDA.


                                        • ViennaCL provides CUDA, OpenCL and OpenMP computing backends. It enables simple, high-level access to the vast computing resources available on parallel architectures such as GPUs and is primarily focused on common sparse and dense linear algebra operations (BLAS levels 1, 2 and 3). It also provides iterative solvers with optional preconditioners for large systems of equations.


                                        A good website for benchmarks for FP64, FP32 and FP16 is Lambda Labs, one article in particular ("Deep Learning GPU Benchmarks - Tesla V100 vs RTX 2080 Ti vs GTX 1080 Ti vs Titan V") offers a great bottom line on what you get, and how much it costs. Don't let the DL slant discourage you, DL can be used for optimization (fast results, not guaranteed to be absolutely optimal) either as a starting point for your variables or a final result. My purpose of mentioning this article is to cite these quotes:




                                        "Results summary



                                        As of February 8, 2019, the NVIDIA RTX 2080 Ti is the best GPU for deep learning research on a single GPU system running TensorFlow. A typical single GPU system with this GPU will be:



                                        • 37% faster than the 1080 Ti with FP32, 62% faster with FP16, and 25% more expensive.

                                        • 35% faster than the 2080 with FP32, 47% faster with FP16, and 25% more expensive.

                                        • 96% as fast as the Titan V with FP32, 3% faster with FP16, and ~1/2 of the cost.

                                        • 80% as fast as the Tesla V100 with FP32, 82% as fast with FP16, and ~1/5 of the cost.

                                        Note that all experiments utilized Tensor Cores when available and are priced out on a complete single GPU system cost. As a system builder and AI research company, we're trying to make benchmarks that are scientific, reproducible, correlate with real world training scenarios, and have accurate prices. So, we've decided to make the spreadsheet that generated our graphs and (performance / $) tables public."



                                        ...



                                        "2080 Ti vs V100 - is the 2080 Ti really that fast?



                                        How can the 2080 Ti be 80% as fast as the Tesla V100, but only 1/8th of the price? The answer is simple: NVIDIA wants to segment the market so that those with high willingness to pay (hyper scalers) only buy their TESLA line of cards which retail for ~$9,800. The RTX and GTX series of cards still offers the best performance per dollar.



                                        If you're not AWS, Azure, or Google Cloud then you're probably much better off buying the 2080 Ti. There are, however, a few key use cases where the V100s can come in handy:



                                        • If you need FP64 compute. If you're doing Computational Fluid Dynamics, n-body simulation, or other work that requires high numerical precision (FP64), then you'll need to buy the Titan V or V100s. If you're not sure if you need FP64, you don't. You would know.

                                        • If you absolutely need 32 GB of memory because your model size won't fit into 11 GB of memory with a batch size of 1. If you are creating your own model architecture and it simply can't fit even when you bring the batch size lower, the V100 could make sense. However, this is a pretty rare edge case. Fewer than 5% of our customers are using custom models. Most use something like ResNet, VGG, Inception, SSD, or Yolo.

                                        So. You're still wondering. Why would anybody buy the V100? It comes down to marketing.



                                        2080 Ti is a Porsche 911, the V100 is a Bugatti Veyron



                                        The V100 is a bit like a Bugatti Veyron. It's one of the fastest street legal cars in the world, ridiculously expensive, and, if you have to ask how much the insurance and maintenance is, you can't afford it. The RTX 2080 Ti, on the other hand, is like a Porsche 911. It's very fast, handles well, expensive but not ostentatious, and with the same amount of money you'd pay for the Bugatti, you can buy the Porsche, a home, a BMW 7-series, send three kids to college, and have money left over for retirement. [Rob's note: costs are different for him compared to my calculations.]



                                        And if you think I'm going overboard with the Porsche analogy, you can buy a DGX-1 8x V100 for $120,000 or a Lambda Blade 8x 2080 Ti for $28,000 and have enough left over for a real Porsche 911. Your pick.".




                                        Thus, you want to pick a GPU manufacturer that provides better benchmarks for the programs you want to run, unless you have the source code and possess some GPU tweaking skills.



                                        The best deal is probably the AMD Radeon VII with it's FP64 rate of 1/4 for only U$700, and even though it's new it's also being discontinued; so there may be some price drops coming. Unfortunately while the hardware is probably a better deal for many people the amount of software available that can wrestle the performance out of it is far fewer and not as developed as what's available for an Nvidia card.




                                        1. Does there exist any mathematical optimization software that can fully exploit multiple modern GPUs?



                                        All of the above links list software that benefits from more GPU cores, even if they are spread across multiple cards, multiple machines or even cloud GPU computing in some cases.



                                        An often quoted article about selecting a GPU and using multiple GPUs is: "Which GPU(s) to Get for Deep Learning: My Experience and Advice for Using GPUs in Deep Learning" (2019-04-03) by Tim Dettmers. While it's focus is on Deep Learning it provides an excellent explanation of the difficulties and performance increase to be expected when using multiple GPUs.



                                        He also says something about the usage of GPUs, in general (where applicable), but again it's in reference to DL (though still applicable to OR optimization):




                                        "Overall I think I still cannot give a clear recommendation for AMD GPUs for ordinary users that just want their GPUs to work smoothly. More experienced users should have fewer problems and by supporting AMD GPUs and ROCm/HIP developers they contribute to the combat against the monopoly position of NVIDIA as this will greatly benefit everyone in the long-term. If you are a GPU developer and want to make important contributions to GPU computing, then an AMD GPU might be the best way to make a good impact over the long-term. For everyone else, NVIDIA GPUs might be the safer choice.".




                                        Articles about using GPUs for Operations Research:



                                        • GPU Computing Applied to Linear and Mixed Integer Programming


                                        • GPU computing in discrete optimization. Part II: Survey focused on routing problems


                                        • gpuMF: a framework for parallel hybrid metaheuristics on GPU with application to the minimisation of harmonics in multilevel inverters


                                        I'll return later to expand this answer.







                                        share|improve this answer















                                        share|improve this answer




                                        share|improve this answer








                                        edited Jul 22 at 18:41

























                                        answered Jul 22 at 4:21









                                        RobRob

                                        1,6311 gold badge6 silver badges27 bronze badges




                                        1,6311 gold badge6 silver badges27 bronze badges










                                        • 1




                                          $begingroup$
                                          Thanks. The post by Tim Dettmers was ispiring my question.
                                          $endgroup$
                                          – Stefano Gualandi
                                          Jul 22 at 5:36










                                        • $begingroup$
                                          @StefanoGualandi - You are most welcome. I was just on my way back with an addition when I was diverted, but you can check back tomorrow. BTW: How many bits are you planning on using most often (FP64, bfloat, INT2-8, etc.), how much memory (can you afford, over paying for performance). And, what is your overall budget - do you want 4x$500 cards or 2x$3000 cards, for example. Do you want bang for your $, or simply very fast but not most expensive? Please add to your question any additional info you wish to offer and I'll try to address a specific situation.
                                          $endgroup$
                                          – Rob
                                          Jul 22 at 5:45











                                        • $begingroup$
                                          the last two papers you linked are interesting, but I need to find out the time to read them carefully.
                                          $endgroup$
                                          – Stefano Gualandi
                                          Jul 22 at 8:59










                                        • $begingroup$
                                          indeed the last two papers look interesting. I browsed through the first one which is a meta analysis of several other papers that use GPUs for OR problems. The paper is somewhat old (from 2016 with many cited papers from early 201Xs) but even at that time the results at least for exact approaches and simplex look underwhelming: "The authors use randomly generated instances of ATSP with up to 16 cities", "...with 8,000 variables and 2,700 constraints". Often it is not even clear what they compare against.
                                          $endgroup$
                                          – JakobS
                                          Jul 22 at 10:49












                                        • 1




                                          $begingroup$
                                          Thanks. The post by Tim Dettmers was ispiring my question.
                                          $endgroup$
                                          – Stefano Gualandi
                                          Jul 22 at 5:36










                                        • $begingroup$
                                          @StefanoGualandi - You are most welcome. I was just on my way back with an addition when I was diverted, but you can check back tomorrow. BTW: How many bits are you planning on using most often (FP64, bfloat, INT2-8, etc.), how much memory (can you afford, over paying for performance). And, what is your overall budget - do you want 4x$500 cards or 2x$3000 cards, for example. Do you want bang for your $, or simply very fast but not most expensive? Please add to your question any additional info you wish to offer and I'll try to address a specific situation.
                                          $endgroup$
                                          – Rob
                                          Jul 22 at 5:45











                                        • $begingroup$
                                          the last two papers you linked are interesting, but I need to find out the time to read them carefully.
                                          $endgroup$
                                          – Stefano Gualandi
                                          Jul 22 at 8:59










                                        • $begingroup$
                                          indeed the last two papers look interesting. I browsed through the first one which is a meta analysis of several other papers that use GPUs for OR problems. The paper is somewhat old (from 2016 with many cited papers from early 201Xs) but even at that time the results at least for exact approaches and simplex look underwhelming: "The authors use randomly generated instances of ATSP with up to 16 cities", "...with 8,000 variables and 2,700 constraints". Often it is not even clear what they compare against.
                                          $endgroup$
                                          – JakobS
                                          Jul 22 at 10:49







                                        1




                                        1




                                        $begingroup$
                                        Thanks. The post by Tim Dettmers was ispiring my question.
                                        $endgroup$
                                        – Stefano Gualandi
                                        Jul 22 at 5:36




                                        $begingroup$
                                        Thanks. The post by Tim Dettmers was ispiring my question.
                                        $endgroup$
                                        – Stefano Gualandi
                                        Jul 22 at 5:36












                                        $begingroup$
                                        @StefanoGualandi - You are most welcome. I was just on my way back with an addition when I was diverted, but you can check back tomorrow. BTW: How many bits are you planning on using most often (FP64, bfloat, INT2-8, etc.), how much memory (can you afford, over paying for performance). And, what is your overall budget - do you want 4x$500 cards or 2x$3000 cards, for example. Do you want bang for your $, or simply very fast but not most expensive? Please add to your question any additional info you wish to offer and I'll try to address a specific situation.
                                        $endgroup$
                                        – Rob
                                        Jul 22 at 5:45





                                        $begingroup$
                                        @StefanoGualandi - You are most welcome. I was just on my way back with an addition when I was diverted, but you can check back tomorrow. BTW: How many bits are you planning on using most often (FP64, bfloat, INT2-8, etc.), how much memory (can you afford, over paying for performance). And, what is your overall budget - do you want 4x$500 cards or 2x$3000 cards, for example. Do you want bang for your $, or simply very fast but not most expensive? Please add to your question any additional info you wish to offer and I'll try to address a specific situation.
                                        $endgroup$
                                        – Rob
                                        Jul 22 at 5:45













                                        $begingroup$
                                        the last two papers you linked are interesting, but I need to find out the time to read them carefully.
                                        $endgroup$
                                        – Stefano Gualandi
                                        Jul 22 at 8:59




                                        $begingroup$
                                        the last two papers you linked are interesting, but I need to find out the time to read them carefully.
                                        $endgroup$
                                        – Stefano Gualandi
                                        Jul 22 at 8:59












                                        $begingroup$
indeed the last two papers look interesting. I browsed through the first one, which is a meta-analysis of several other papers that use GPUs for OR problems. The paper is somewhat old (from 2016, with many cited papers from the early 2010s), but even at that time the results, at least for exact approaches and the simplex method, look underwhelming: "The authors use randomly generated instances of ATSP with up to 16 cities", "...with 8,000 variables and 2,700 constraints". Often it is not even clear what they compare against.
                                        $endgroup$
                                        – JakobS
                                        Jul 22 at 10:49



