
Everyone is playing with AIGC: is your computing power strong enough?

  Ever since ChatGPT became a worldwide sensation overnight, AI image generation has taken off as well and become a focus of public attention. Many companies have even begun using AI-generated images to replace part of their designers’ work, and there is a growing sense that “if you can’t use AI, you will be left behind by the times”. Admittedly, AI output still has plenty of room for improvement in the details of real-world objects, but there is no denying that once computing power and models have advanced substantially, AI will be fully capable of handling part of the design workload. Anyone interested in AI applications can therefore start learning now. As the saying goes, though, to do a good job you must first sharpen your tools, so what kind of computer should you use for AI image generation?

Local AI drawing: efficient tools are indispensable

  AI image generation currently comes in two forms: cloud and local. Simply put, online services (such as the well-known Midjourney) are easier to get started with, place almost no demands on local hardware, and have a high floor on image quality, so it is easy to produce a reasonably good picture. However, many prompt words are blocked, so the degree of freedom is relatively low. In addition, the advanced tiers of cloud services that give better results generally require payment, and access from within China may be subject to certain restrictions.
  Local tools such as Stable Diffusion are harder to learn, but their extensibility and prompt freedom are extremely high, so the ceiling on output quality is extremely high as well. Most importantly, Stable Diffusion itself and the large number of available plug-ins and models are free, and training your own model is more convenient, which makes it better suited to enthusiasts who want to experiment.

  Of course, because the computation is local, high efficiency requires capable hardware. Local image generation with Stable Diffusion relies mainly on the GPU (a CPU also works, but its parallel computing efficiency is far lower, by a factor of well over a hundred in the tests later in this article), and it places certain demands on the rest of the system too. Anyone who wants to get into AI drawing therefore has good reason to upgrade, or simply to buy an efficient new computer.
  Further reading: Precautions for local deployment of Stable Diffusion
  There are plenty of local deployment tutorials for Stable Diffusion, and a quick search turns up many of them. Here we only briefly summarize a few points that need attention.
  ● Git needs to be installed; it is one of the prerequisites for installing Stable Diffusion WebUI. Download address: https://git-scm.com/download/win
  ● A Python 3.10 runtime environment needs to be installed. Miniconda download address: https://docs.conda.io/en/main/miniconda.html
  ● To use an NVIDIA card, the latest version of cuDNN (CUDA Deep Neural Network library) needs to be installed. Only the latest version lets the RTX 40 series deliver its real performance. Download address: http://go.cpcw.com/cudnn2023
  If deploying it yourself seems too troublesome and you do not know where to start, you can also download an integration package, unzip it and use it directly. Such a package not only includes the runtime environment Stable Diffusion requires (if it does not run properly, you can still install Git and Python as described above), but also ships with presets for different configurations, especially different video memory capacities. It is more convenient to use and especially suitable for beginners.
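  The prerequisites listed above can be checked with a short script before starting a manual deployment. This is only a minimal sketch: the function name `check_prerequisites` is our own, and the CUDA check assumes PyTorch happens to be installed already (if it is not, the result is reported as unknown).

```python
import shutil
import sys


def check_prerequisites(required_python=(3, 10)):
    """Return a dict describing whether Git and the expected Python are present."""
    report = {
        # shutil.which returns the executable path, or None if Git is not on PATH
        "git_found": shutil.which("git") is not None,
        "python_version": tuple(sys.version_info[:2]),
        "python_ok": tuple(sys.version_info[:2]) == tuple(required_python),
    }
    try:
        # Assumption: PyTorch may or may not be installed yet; it is only used
        # here to ask whether a CUDA-capable NVIDIA setup is visible.
        import torch
        report["cuda_available"] = bool(torch.cuda.is_available())
    except (ImportError, OSError):
        report["cuda_available"] = None  # unknown: PyTorch missing or not loadable
    return report


if __name__ == "__main__":
    for key, value in check_prerequisites().items():
        print(f"{key}: {value}")
```

  A `python_ok` value of False does not necessarily mean the WebUI will fail, only that the interpreter differs from the Python 3.10 the article recommends.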

  For example, the integration package shared by the Bilibili uploader “Autumn Leaf aaaki” has been upgraded to version V4 and supports both CPU computation and CUDA acceleration on NVIDIA graphics cards; you can download it from Bilibili yourself. At present, AMD graphics cards can only accelerate Stable Diffusion through DirectML under Windows (the ROCm open software platform for GPU computing is supported under Linux, and ROCm is due to arrive on Windows soon), and the efficiency is far below that of CUDA on NVIDIA cards. As a result, only a few integration packages include the DirectML library that AMD cards need (users can add it themselves), and the same applies to Intel graphics cards. NVIDIA graphics cards are therefore almost the only recommended choice for local AI image generation.
How to choose computer components for local AI drawing?

  As mentioned above, local AI drawing relies mainly on GPU computing. However, the overall design workflow involves more than generating images or training models, and the computer may be built for other design or productivity applications as well, so the choice of components still matters, mainly the processor, memory and graphics card. Storage has no special requirements: any SSD with enough capacity for your materials and model files will do.

  Processor
  In fact, even a flagship processor such as the Intel Core i9-13900K is only about 1/256 as efficient at generating images in Stable Diffusion as NVIDIA’s flagship RTX 4090, and about 1/141 as efficient as the RTX 3090 Ti. A single 768×768 image with 50 sampling steps takes it nearly 13 minutes, which is really not efficient. The processor therefore only steps in as a temporary substitute when the computer has no graphics card that supports hardware acceleration in Stable Diffusion. If the machine is built only for AI image generation, there is no high requirement for processor performance.
  Even so, the platform should not be too old, because it needs sufficient expandability and room for upgrades, and the machine may also have to handle other productivity tasks. We therefore suggest two different directions for choosing a processor.
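  To put those ratios in perspective, the quoted figures can be turned into rough per-image times. This is only back-of-the-envelope arithmetic, assuming the 1/256 and 1/141 efficiency ratios apply directly to the "nearly 13 minutes" per-image time; the resulting GPU times are estimates, not measurements from the article.

```python
# "Nearly 13 minutes" per 768x768, 50-step image on the Core i9-13900K
CPU_SECONDS_PER_IMAGE = 13 * 60

# Speed of each GPU relative to the CPU, as quoted in the article
SPEEDUP_OVER_CPU = {
    "RTX 4090": 256,
    "RTX 3090 Ti": 141,
}


def estimated_gpu_seconds(cpu_seconds, speedup):
    """Estimate GPU time per image by dividing CPU time by the quoted speedup."""
    return cpu_seconds / speedup


if __name__ == "__main__":
    for gpu, speedup in SPEEDUP_OVER_CPU.items():
        seconds = estimated_gpu_seconds(CPU_SECONDS_PER_IMAGE, speedup)
        print(f"{gpu}: about {seconds:.1f} s per image")
```

  Under these assumptions the estimate comes out at roughly 3 seconds per image for the RTX 4090 and about 5.5 seconds for the RTX 3090 Ti.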

  ● Focus on cost-effectiveness: a thousand-yuan CPU is enough
  If there are no other heavy productivity requirements, one of the latest CPUs at around 1,000 yuan is enough. On the Intel side, consider the Core i5-13400F; on the AMD side, consider the Ryzen 5 7600 Smart Edition. Why not go cheaper? On the one hand, both models support PCIe 5.0 and DDR5 memory, which makes it easier to upgrade later to a PCIe 5.0 graphics card and SSD and to expand memory capacity. This matters: judging by current NVIDIA and AMD practice, low-end graphics cards will very likely offer only 8 PCIe lanes in the future, and if the processor or motherboard does not support PCIe 5.0, such a card’s bandwidth is halved. On the other hand, their performance is sufficient for mainstream productivity applications, and their prices are more affordable.
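  The "bandwidth will be halved" point follows from the PCIe per-lane transfer rates. The sketch below uses the standard rates (8, 16 and 32 GT/s for PCIe 3.0, 4.0 and 5.0, all with 128b/130b encoding); the function name is our own, and the result is theoretical one-direction bandwidth, not what a real card achieves.

```python
# Raw transfer rate per lane in GT/s for each PCIe generation
GT_PER_S = {3: 8.0, 4: 16.0, 5: 32.0}

# PCIe 3.0 and later use 128b/130b line encoding
ENCODING_EFFICIENCY = 128 / 130


def pcie_bandwidth_gb_s(generation, lanes):
    """Theoretical one-direction bandwidth in GB/s for a PCIe link."""
    bits_per_second_per_lane = GT_PER_S[generation] * ENCODING_EFFICIENCY
    return bits_per_second_per_lane / 8 * lanes  # 8 bits per byte


if __name__ == "__main__":
    for gen, lanes in [(5, 8), (4, 8), (4, 16)]:
        print(f"PCIe {gen}.0 x{lanes}: {pcie_bandwidth_gb_s(gen, lanes):.2f} GB/s")
```

  An x8 card in a PCIe 5.0 slot gets the same theoretical bandwidth as an x16 card on PCIe 4.0, while the same x8 card on a PCIe 4.0 platform gets half of that.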

  ● Focus on versatility: consider a high-end multi-core CPU
  If the computer also has to handle productivity work beyond AI drawing (video editing, 3D rendering output), multi-threaded processor performance matters more, so go for a higher-end model with more cores. Here we give priority to the flagship models of AMD’s Ryzen 7000 series, such as the Ryzen 9 7950X and Ryzen 9 7900X, followed by the Intel Core i9-13900 series and Core i7-13700 series.
  The reason is that in our tests, 13th-generation Core processors with a large number of efficiency cores sometimes allocated cores incorrectly when running certain productivity software, including Stable Diffusion: all heavy-load processes were assigned to the efficiency cores while the performance cores sat idle, which greatly reduces the processor’s computing efficiency. The Ryzen 7000 series, built entirely from large cores, does not have this problem.

  Memory
  Having settled on the Ryzen 7000 or 13th-generation Core platform, DDR5 memory is the natural choice. Although the 13th-generation Core also supports DDR4, considering future upgrade room and the needs of other productivity applications, higher-bandwidth DDR5 is clearly more appropriate.
  In terms of capacity, as long as Stable Diffusion is not generating images on the processor, it does not consume much system memory, and 32GB is more than enough. Memory prices are currently favorable, so a 2×16GB kit is no burden for a mainstream build and also copes with other productivity applications. We strongly recommend going straight for 2×16GB.
  As for frequency, higher memory bandwidth does improve productivity efficiency, but cost performance must be considered too. Memory faster than DDR5-6400 is still quite expensive, and DDR5-6400/6000 currently offers the best overall value. In addition, on the AMD Ryzen 7000 platform DDR5-6400 is effectively the limit, so there is no need to consider faster kits.
  Graphics card
  Finally we come to the star component for Stable Diffusion. As mentioned earlier, local image generation with Stable Diffusion favors NVIDIA graphics cards, whose CUDA ecosystem is hard to replace. AMD and Intel graphics cards, which rely on DirectML for general-purpose computing, are only stand-ins (AMD cards can be expected to improve once the Windows version of ROCm arrives), and their efficiency and compatibility are hard to compare with NVIDIA cards. Rather than spending time solving the various problems of AMD and Intel cards in Stable Diffusion, it is more practical to simply use an NVIDIA card.
  Besides GPU computing power, the most important factor for local Stable Diffusion output is video memory: the larger the video memory, the higher the output resolution you can set. Taking the text-to-image test later in this article as an example, 8GB is recommended for beginners and is enough for a resolution of 768×768 (peak usage about 6.9GB); for 1024×1024, a model with 12GB of video memory is recommended (peak usage about 9GB); above that come the 16GB/20GB/24GB models. There are also modified RTX 2060/2080 Ti cards on the market with 12GB/22GB of video memory, but such cards obviously carry no warranty, so ordinary users are advised not to take the risk. Professional compute cards with massive video memory (such as the NVIDIA A100 80GB PCIe) are beyond the budget of most enthusiasts, so we will not discuss them here.
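  The video memory guidance above can be expressed as a small lookup. This is a sketch only: the peak usage figures and memory tiers come from the article, while the function name and the 1GB safety margin are our own assumptions.

```python
# Peak video memory usage reported in the article's text-to-image test (GB)
MEASURED_PEAK_GB = {
    (768, 768): 6.9,
    (1024, 1024): 9.0,
}

# Video memory tiers named in the article (GB)
VRAM_TIERS_GB = [8, 12, 16, 20, 24]


def smallest_fitting_tier(width, height, headroom_gb=1.0):
    """Return the smallest listed tier that covers measured peak usage plus headroom.

    Returns None for resolutions the article did not measure.
    headroom_gb is a hypothetical safety margin, not a figure from the article.
    """
    peak = MEASURED_PEAK_GB.get((width, height))
    if peak is None:
        return None
    for tier in VRAM_TIERS_GB:
        if tier >= peak + headroom_gb:
            return tier
    return None


if __name__ == "__main__":
    for resolution in MEASURED_PEAK_GB:
        print(resolution, "->", smallest_fitting_tier(*resolution), "GB")
```

  With these inputs the lookup reproduces the article's recommendation of 8GB for 768×768 and 12GB for 1024×1024.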

  In addition, “antique-grade” pure compute cards such as the NVIDIA Tesla P40/M40 have recently attracted attention from AI image enthusiasts. They have a large 24GB of video memory, and second-hand prices are very attractive (the M40 costs only 499 yuan, while the P40 has jumped in a single wave from 899 yuan to 1,199 yuan). However, these cards are passively cooled, so buyers have to fit their own cooling, and the old Pascal and Maxwell architectures draw a lot of power. The cooling modification costs money and is hard for ordinary users to handle. Most importantly, these old architectures do not offer efficient FP16 half-precision computation, whereas Stable Diffusion gains considerable efficiency and saves video memory in half-precision mode, which greatly reduces the value of these old used cards. They also come with no reliable warranty, so ordinary users are advised not to bother.
  There are also ways to reduce Stable Diffusion’s video memory usage so that graphics cards with less memory can generate higher-resolution images, such as xFormers and MultiDiffusion with Tiled VAE. These are beyond the scope of this article; interested readers can research them on their own.
  So how efficient are the currently available NVIDIA graphics cards in Stable Diffusion? We ran a comparative test, and the results can serve as a reference.
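  Why half precision saves video memory comes down to bytes per value: an FP32 number takes 4 bytes and an FP16 number takes 2. The sketch below shows the arithmetic for model weights only; the parameter count of one billion is a hypothetical round number chosen for illustration, not a figure from the article, and real usage also includes activations and other buffers.

```python
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2}


def weights_gib(n_params, precision):
    """Memory needed to hold n_params weights at the given precision, in GiB."""
    return n_params * BYTES_PER_PARAM[precision] / 2**30


if __name__ == "__main__":
    n_params = 1_000_000_000  # hypothetical model size, for illustration only
    for precision in ("fp32", "fp16"):
        print(f"{precision}: {weights_gib(n_params, precision):.2f} GiB")
```

  Whatever the actual model size, storing the weights in FP16 halves the memory they occupy compared with FP32.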
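  As one concrete illustration of such memory-saving options, Stable Diffusion WebUI accepts launch arguments such as `--xformers`, `--medvram` and `--lowvram`. The sketch below picks a set of arguments from the card's video memory. The thresholds of 4GB and 8GB are our own assumptions rather than recommendations from the article or the WebUI project, and the flag names should be checked against the WebUI version you actually run.

```python
def suggest_webui_args(vram_gb, use_xformers=True):
    """Suggest WebUI launch arguments for a card with vram_gb of video memory.

    The thresholds are hypothetical; tune them for your own card and models.
    """
    args = []
    if use_xformers:
        args.append("--xformers")  # memory-efficient attention on NVIDIA cards
    if vram_gb <= 4:
        args.append("--lowvram")   # strongest memory savings, slowest
    elif vram_gb <= 8:
        args.append("--medvram")   # moderate memory savings
    return args


if __name__ == "__main__":
    for vram in (4, 6, 8, 12, 24):
        print(f"{vram}GB:", " ".join(suggest_webui_args(vram)))
```

  The resulting string would typically go into the `COMMANDLINE_ARGS` setting of the WebUI launch script.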
  The figure shows our Stable Diffusion output settings. By default we use the CKPT model officially provided by NVIDIA. The sampling method is Euler a, the number of sampling steps is 50, the CFG Scale (prompt relevance) is 7.5, the batch count is 10, the batch size is 2 images, and the image resolution is 768×768. The prompt is: “beautiful render of a Tudor style house near the water at sunset, fantasy forest. Photorealistic, cinematic composition, cinematic high detail, ultrarealistic, cinematic lighting, Depth of Field, hyper-detailed, beautifully color-coded, 8k, many details, chiaroscuro lighting, ++dreamlike, vignette”.
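  For reference, the test settings can be written down as a plain dictionary, which also makes the total workload explicit. The key names are our own and are not tied to any particular tool's configuration format.

```python
# Test settings as described in the article (key names are illustrative)
TEST_SETTINGS = {
    "sampler": "Euler a",
    "steps": 50,
    "cfg_scale": 7.5,
    "batch_count": 10,  # number of batches generated one after another
    "batch_size": 2,    # images generated together in each batch
    "width": 768,
    "height": 768,
}


def total_images(settings):
    """Total number of images produced by one test run."""
    return settings["batch_count"] * settings["batch_size"]


def total_sampling_passes(settings):
    """Total denoising passes, each processing batch_size images at once."""
    return settings["batch_count"] * settings["steps"]


if __name__ == "__main__":
    print("images:", total_images(TEST_SETTINGS))
    print("sampling passes:", total_sampling_passes(TEST_SETTINGS))
```

  Each run therefore produces 20 images through 500 batched sampling passes, which is why video memory capacity matters as much as raw speed in this test.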

  Judging from the test results, apart from the Core i9-13900K, whose data bar is too short to see, the output efficiency of the graphics cards is roughly proportional to price. The RTX 4070 is nevertheless slightly faster than the more expensive previous-generation RTX 3080, and its 12GB of video memory is an advantage over the RTX 3080’s 10GB. The RTX 4070 Ti is slightly faster than the RTX 3090 under these settings, but the RTX 3090 has a massive 24GB of video memory, twice the RTX 4070 Ti’s 12GB, so it should pull ahead once the image resolution is raised far enough. As for the GTX 1660 Ti at the bottom, its 6GB of video memory was exhausted in the test, so its efficiency is significantly lower; even so, it is 7.7 times as fast as the processor.
  On the whole, for mainstream users the RTX 3060 is an outstanding choice in terms of overall cost performance. Its computing power cannot match high-end cards, but it has a large 12GB of video memory, more than some higher-end cards offer. Of course, if funds allow, the more powerful the graphics card the better, but pay attention to matching the power supply: the GPU runs at full load while generating images, and an inadequate power supply will cause the machine to shut down.
Recommended builds for the AI “designer”: there is always one that suits you
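  Matching the power supply can be approached with a rough estimate. The sketch below is an illustration under stated assumptions, not a sizing rule from the article: the 100W allowance for the rest of the system, the 1.5× headroom factor and the rounding to 50W steps are all hypothetical, and the example wattages (90W processor, 200W graphics card) are the full-load figures the article quotes for the Ryzen 9 7900 and RTX 4070.

```python
import math


def suggested_psu_watts(cpu_watts, gpu_watts, other_watts=100, headroom=1.5):
    """Rough PSU size: total full-load draw times a headroom factor.

    other_watts and headroom are hypothetical defaults; adjust to taste.
    The result is rounded up to the next multiple of 50W.
    """
    total = (cpu_watts + gpu_watts + other_watts) * headroom
    return math.ceil(total / 50) * 50


if __name__ == "__main__":
    print(suggested_psu_watts(cpu_watts=90, gpu_watts=200), "W")
```

  Under these assumptions the example build lands at a 600W unit; a more power-hungry graphics card pushes the figure up accordingly.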

  After the analysis above, you should have a clear idea of Stable Diffusion’s hardware requirements. If you still do not know which specific components to choose, we offer three builds, entry-level, mid-range and high-end, for reference. Besides meeting the needs of local AI drawing with Stable Diffusion, all three cover different levels of productivity applications. They are of course also good for gaming. After all, “productivity before buying, after buying…”, as everyone knows.
  Basic model: AI drawing + light productivity application
  For mainstream users, a build with a thousand-yuan CPU and an RTX 3060 graphics card is entirely sufficient. Although we noted earlier that the hybrid design of the 13th-generation Core can leave some productivity loads running only on the efficiency cores, AI drawing relies mainly on the graphics card, and the Core i5-13400F, at around 1,000 yuan, offers the best value among the latest generation of mid-range CPUs. What’s more, the core allocation problem can be solved by manually assigning the load to the performance cores.
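  One way to assign a process to the performance cores manually is to set its CPU affinity. The sketch below uses the third-party `psutil` package and rests on an important assumption: that the operating system numbers the performance-core threads first (logical CPUs 0 to 11 on a 6 performance-core processor with Hyper-Threading). That ordering should be verified on your own machine, for example in Task Manager, before relying on it. The function names are our own.

```python
def performance_core_cpus(p_cores, threads_per_p_core=2):
    """Logical CPU indices of the performance cores.

    Assumes performance-core threads are enumerated before efficiency cores,
    which is an assumption to verify on the actual system.
    """
    return list(range(p_cores * threads_per_p_core))


def pin_to_performance_cores(pid, p_cores):
    """Restrict the process with the given pid to the performance cores."""
    import psutil  # third-party: pip install psutil

    process = psutil.Process(pid)
    process.cpu_affinity(performance_core_cpus(p_cores))
    return process.cpu_affinity()


if __name__ == "__main__":
    # Example for a processor with 6 performance cores
    print(performance_core_cpus(6))
```

  Calling `pin_to_performance_cores` with the process ID of the Python process running Stable Diffusion would then keep the scheduler from placing it on the efficiency cores.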

  The graphics card is of course the focus for AI drawing. As the earlier tests show, the RTX 3060 is only slightly slower than the RTX 3060 Ti at image generation, yet it has 12GB of video memory, which supports higher-resolution output, and it is considerably cheaper than the RTX 3060 Ti. For AI drawing alone, the RTX 3060 is indeed the more cost-effective choice for a mainstream configuration. Here we choose the Gigabyte RTX 3060 WINDFORCE in its 12GB version (note that the RTX 3060 also comes in an 8GB version, so do not buy the wrong one). Its dual-fan design handles the heat comfortably, so stability is not a concern during long generation runs.
  Mainstream model: All-round designer computer
  The all-round designer computer takes greater account of productivity applications beyond AI drawing, so the processor is the 12-core, 24-thread Ryzen 9 7900 Smart Edition, with an all-large-core design and no process allocation problems. At the same time, the non-X Ryzen 9 7900 draws only 90W at full load with PBO disabled, while trailing the Ryzen 9 7900X by less than 10% in performance. Whether for video editing or 3D rendering output, it delivers good efficiency.

  The graphics card is the latest RTX 4070. In computing power it surpasses the 10GB version of the RTX 3080, while its full-load power is only 200W, so its demands on cooling and power supply are lower than the RTX 3080’s and image generation is more stable and energy-efficient. The RTX 4070 also has 2GB more video memory and a lower price, so on the whole it is clearly the better choice. The Gigabyte RTX 4070 WINDFORCE is a representative value-oriented, sweet-spot RTX 4070. Its triple-fan cooler has no trouble with a 200W card, and noise is kept low.
  Flagship model: high-efficiency productivity weapon
  The flagship configuration can fairly be called a high-efficiency productivity weapon. The flagship Ryzen 9 7950X, with 16 cores and 32 threads, uses an all-large-core design, so there is no need to worry about core allocation. Compared with having to assign loads manually to the 8 performance cores of the Core i9-13900K, its adaptability advantage is obvious, and from this point of view it does have the edge over the Core i9-13900K despite the latter’s higher core count.

  For the graphics card we choose the RTX 4080, whose 16GB of video memory makes high-resolution AI drawing more efficient. The earlier tests also show that the RTX 4080 is about 30% faster than the RTX 4070 Ti at AI drawing. In addition, a flagship-class NVIDIA card like the RTX 4080 provides very powerful acceleration in video editing and 3D design, making it especially suitable for users with high-end design needs. The RTX 4080 EAGLE is the value model in the GIGABYTE RTX 4080 family: it offers flagship-grade power delivery and cooling at a price below 10,000 yuan, so heat and stability are not a concern under long full-load runs, which further raises the value of the whole flagship configuration.

Summary: Get ready to “fight the future” and evolve with AI

  First came ChatGPT, then AI drawing. The sudden spread of AI applications this year has made us feel the force of the AI era, and in some fields (especially visual design and copywriting) AI has even partially replaced human labor. However, AI is a productivity tool that exists to serve people, and what we need to do now is learn how to use it. In the future, using AI to complete all kinds of tasks will be as common as using Office software is today, and it may become an essential job skill. So if you do not want to fall behind the times, evolve with AI and get ready to “fight the future”!
