With imaging hardware that vastly outclasses the iPhone's, why can't cameras keep up with it in everyday photography?

At the launch of the iPhone 11 series, Apple senior vice president Philip W. Schiller used the term "computational photography" to describe the iPhone 11 Pro's imaging system. It was the first time the term reached a truly mainstream audience.

In fact, computational photography is not a new concept. It first appeared in a published paper in 1994, which classified in-camera synthesis such as HDR, panoramic stitching and simulated bokeh as computational photography. At the time, however, film was still the mainstream image carrier, digital photography had only just begun, and phones had no cameras at all.

▲ Philip W. Schiller introducing computational photography at the iPhone 11 Pro launch. Picture from: Apple

Decades later, the carrier of images has shifted from film to digital, phones have gained cameras, and computational photography has moved from theory into practice, steadily becoming a major trend.

Cameras, however, have stayed out of this trend. Camera makers keep steadily improving pixel counts, burst speeds and video capability, yet seem oblivious to computational photography. Straight-out-of-camera photos are still mediocre, and smartphones are increasingly "surpassing" them.

By contrast, mobile phone chips keep growing more powerful, with ever more AI, algorithms and machine learning built in. Computational imaging techniques multiply, photos are increasingly shaped by a set of "algorithms", and taking a good one keeps getting easier.

Today more people shoot and share with their phones, and cameras grow less popular. The commercial fortunes of the two tell the same story: smartphone shipments have surged while the camera market shrinks year by year. Even the compact camera (DC) has all but disappeared.

Some may wonder at this point: if smartphone photos look so good, why don't traditional camera manufacturers embrace the computational photography trend and improve the basic look of their photos?

Do cameras simply lack the processing power to "compute"?

Let's start this question from the "core": the chip.

The heart of a mobile phone is the SoC, which integrates the CPU, GPU, ISP, NPU, baseband and more. It lets you make calls, take photos, watch videos, play games and browse the web, and it directly determines the phone's performance.

A camera's core components are similar in role to a phone's, apart from their size: the image sensor (CMOS), responsible for light capture and imaging, and the image processor, the central chip that manages the whole camera system.

Take Sony's BIONZ X image processor (used in the α7 series) as an example. It consists of separate SoC and ISP chips rather than integrating the ISP into the SoC. The upside is that Sony can scale the number of ISP chips to the performance requirements of the CMOS (the α7R III's BIONZ X carries dual ISPs); the downside is that the level of integration is lower than in phones.

In BIONZ X, the SoC's role is close to that of a phone's: it controls the interface and camera functions, and its performance requirements are not high. The ISP performs Bayer conversion, demosaicing, noise reduction, sharpening and other operations on the "data" gathered by the image sensor, step by step turning the CMOS output into the photo you actually see. Throughout this process the camera's ISP does no "computation" in the computational-photography sense; it treats images like goods on an assembly line, processed in a fixed, centralized way.
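The assembly-line character of this pipeline can be illustrated with a toy sketch in Python (purely hypothetical; real ISPs run these stages in fixed-function hardware, and real demosaicing is far more sophisticated):

```python
# Toy model of a fixed ISP pipeline: Bayer data -> demosaic -> clamp.
# Hypothetical simplification; names and stages are illustrative only.

def demosaic_rggb(tile):
    """Collapse one 2x2 RGGB Bayer tile (R, G, G, B samples) into a
    single RGB pixel by averaging the two green samples."""
    r, g1, g2, b = tile
    return (r, (g1 + g2) / 2, b)

def clamp(pixels):
    """Crude stand-in for noise reduction / range control:
    clip every channel to the valid 0..255 range."""
    return [tuple(min(max(c, 0), 255) for c in px) for px in pixels]

def pipeline(bayer_tiles):
    """Every frame passes through the same fixed stages,
    with no scene-dependent 'computation' in between."""
    return clamp([demosaic_rggb(t) for t in bayer_tiles])

# One RGGB tile: R=200, G=100 and 110, B=50
print(pipeline([(200, 100, 110, 50)]))  # -> [(200, 105.0, 50)]
```

The point is the shape of the pipeline, not the stages themselves: every frame goes through the same fixed sequence, which is exactly what lets camera ISPs be so fast.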

The camera places high demands on the speed and throughput of image processing: as pixel counts, burst speeds and video output keep rising, the data volume of a single frame is enormous. Even without any "computation" involved, the processing capacity of a camera's image processor far exceeds that of current smartphone ISPs.

Computational imaging, or AI, is another matter. A smartphone's imaging pipeline is broadly similar to a camera's, except that before the final image is displayed, the ISP and DSP must also calculate, adjust and optimize in real time; and since multi-camera systems became popular, the volume of data a phone must compute has multiplied.

After the iPhone 11 Pro series introduced its multi-camera system, the fast, seamless switching between cameras was backed by the two newly added machine learning accelerators in the A13 Bionic, capable of more than one trillion operations per second. The enormous amount of data produced by three cameras essentially eats up that high-frequency, high-throughput processing capability.

In short, a camera's image processor mostly performs fixed preprocessing on the raw image, with virtually no computation involved, whereas a phone's SoC handles both the preprocessing and the subsequent computation on the captured data; the two rely on quite different capabilities.

Products of market segmentation, aimed at different users

Mobile photography has advanced quickly, but the root cause is a limitation: the phone's image sensor (CMOS) is simply too small. With current technology a phone cannot physically approach, let alone surpass, a camera, so it can only refine its images through algorithms: HDR, super night mode, simulated wide aperture, sky replacement and other features.

  ▲ The "calculation" process behind an iPhone photo. Picture from: Apple
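One of the simplest ideas behind such "calculation" is exposure fusion, which merges a bracket of differently exposed frames into one balanced image. The sketch below applies it to a single pixel value (a hypothetical simplification: real HDR pipelines align whole frames and fuse millions of pixels, often in RAW space):

```python
# Minimal exposure-fusion sketch for one pixel (values in 0..1).
# Illustrative only; the weighting loosely follows the common
# "well-exposedness" idea of favoring mid-grey values.

def well_exposedness(v, mid=0.5):
    """Weight a pixel value by its closeness to mid-grey:
    1.0 at mid, falling to 0.0 at pure black or white."""
    return 1.0 - abs(v - mid) / mid

def fuse(exposures):
    """Fuse the same pixel captured at several exposures:
    a weighted average favoring the best-exposed frame."""
    weights = [well_exposedness(v) for v in exposures]
    total = sum(weights) or 1.0  # avoid division by zero
    return sum(v * w for v, w in zip(exposures, weights)) / total

# Underexposed (0.1), well-exposed (0.5) and overexposed (0.9) frames:
# the well-exposed frame dominates the result.
print(round(fuse([0.1, 0.5, 0.9]), 3))  # -> 0.5
```

Unlike the fixed ISP pipeline, this step is data-dependent: the weights change with every scene, which is why it needs general computing power rather than a fixed-function chip.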

  However, these algorithms still leave little room for "personalized" intervention: how strong a filter should be, or how much highlight and shadow detail HDR should retain. For phones aimed at the mass market, though, letting as many people as possible take good photos is exactly what fits their market and audience positioning.

  Since its invention, the camera has had an absolute "tool" attribute: appearance, controls and features all yield to efficiency. Facing a niche professional group, cameras naturally align with those users' needs, recording as much color depth, color and light information as possible so that users can make a wider range of adjustments afterwards. Whether a photo looks good straight out of the camera is simply not their concern.

  ▲ More information is recorded in a RAW file, allowing a wider range of adjustments. Picture from: Ben Sandofsky

  For most people without a foundation in photography, getting a good-looking photo immediately is far more important than getting an information-rich one. For professional camera manufacturers, increasing the color depth of RAW recording fits their market positioning better than improving the straight-out effect of JPGs.
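The gap in post-adjustment headroom can be put in numbers with some back-of-the-envelope arithmetic (illustrative only; actual RAW bit depth varies by camera):

```python
# Each extra bit of color depth doubles the tonal levels per channel,
# which is what gives RAW its editing headroom over 8-bit JPEG.

def tonal_levels(bits):
    return 2 ** bits

jpeg_levels = tonal_levels(8)   # 8-bit JPEG: 256 levels per channel
raw_levels = tonal_levels(14)   # a typical 14-bit RAW: 16384 levels

print(jpeg_levels, raw_levels, raw_levels // jpeg_levels)  # -> 256 16384 64
```

A 14-bit RAW thus records 64 times finer gradations per channel, which is why highlights and shadows can be pushed hard in post before banding appears.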

  Still, things are not absolute, and cameras are also trying to change. Fujifilm has long been committed to straight-out-of-camera results, introducing "film simulation" modes that use different algorithms to give photos more character and better looks. This process involves no scene-based computation, however; users must choose a simulation themselves. It is closer to the film-simulation apps on phones than to so-called "computational photography".

  Is AI post-processing the general direction for cameras?

  In photography, post-processing is an indispensable step. On the one hand, post-processing software can make full use of the rich information recorded in RAW files; on the other, it can exploit the PC's performance and computing power to process photos quickly.

  Unlike camera manufacturers, almost all mainstream professional post-processing software has begun investing in AI and emphasizing AI-driven processing.

  ▲ The post-processing software Luminar 4 supports AI sky replacement. Picture from: Luminar

  Recent versions of Adobe Photoshop have added automatic recognition to operations such as selection, healing and skin smoothing, making them ever more effortless and accurate. The Mac retouching software Pixelmator Pro began using Apple's Core ML machine learning engine to recognize images as early as 2018, applying it to color adjustment, masking, selection and even compressed output.

  ▲ Image editing in Pixelmator Pro 2.0 is powered by a machine learning engine. Picture from: Pixelmator

  As mentioned above, limited by chip AI computing power and a niche market, camera manufacturers have made almost no effort in computational photography. The explosion of AI in post-processing software, however, can be seen as making up for cameras' shortcomings there.

  Yet even with AI in post-processing software, cameras have not escaped the traditional workflow: the camera records, the software processes. For the general public this remains cumbersome. For professional photographers, AI in post-processing software can indeed reduce the workload, turning once-tedious masking into a much easier job, but it still cannot reverse the traditional industry's shoot-then-process workflow, which is fundamentally different from the phone's.

  ▲ Global digital camera shipments in September 2020 were far lower than in 2018. Picture from: CIPA

  According to CIPA data, the camera market is gradually shrinking while the phone market keeps growing. The trend of "computational photography" on smartphones will not change cameras' move toward greater professionalism, nor will it reverse the market's gradual contraction.

  In other words, even if cameras now had "computational photography" capabilities close to smartphones', could that save the "deteriorating" camera market? Of course not. As an extreme example: if it worked, Fujifilm would hold the top market share. In fact the top spot in mirrorless cameras is currently held by Sony, whose straight-out colors are not especially pretty.

  ▲ Sony mirrorless cameras have become workhorses for many studios. Picture from: SmallRig

  Facing the onslaught of phones, cameras can only develop in a more professional direction and keep segmenting the market upward. Recent years' 40- and 60-megapixel full-frame bodies, 100-plus-megapixel medium format, and mirrorless video capabilities that keep approaching professional camcorders are all products of this segmentation.

  Increasing specialization in cameras demands better-performing image sensors (CMOS), while "computational photography" relies on a dedicated machine learning module. As everyone knows, the high cost and high risk of chip development make it hard for camera manufacturers to pursue both: computational photography and professional specialization are two different paths. Meanwhile, features such as "computational photography" and "AI intervention", which are of little use to professional users, are likely to be strategically shelved for now as manufacturers balance R&D costs.

  At this stage, and for the foreseeable future, it is even harder to expect camera manufacturers to embrace "computational photography", with its high risk, heavy investment and slow returns, not least because plenty of professional post-processing software already uses AI to retouch photos after the fact.