# Image Generation

Revolutionizing Digital Image Processing with Novel AI-Developed DragGAN

Venture into the captivating world of AI-driven image generation with our blog series. As we delve into the realm of Generative Adversarial Networks (GANs) and other cutting-edge algorithms, discover how artificial intelligence is reshaping the landscape of visual content creation. From realistic image synthesis to artistic exploration, these articles aim to unravel the transformative impact of AI on pushing the boundaries of creativity and innovation in the field of digital imagery. Whether you are an enthusiast, a creative professional, or simply curious about the future of visual arts, join us in exploring the fascinating synergy between AI and image generation.
A novel AI tool promises that, with just a few mouse clicks, anyone can easily perform photo edits that were previously difficult to achieve.

Various images processed using the DragGAN method. Image Credit: MPI-INF
The technique is being developed by a research group led by the Max Planck Institute for Informatics in Saarbrücken, in particular by the Saarbrücken Research Center for Visual Computing, Interaction and Artificial Intelligence (VIA) based there.
This innovative technique can revolutionize digital image processing.
With DragGAN, we are currently creating a user-friendly tool that allows even non-professionals to perform complex image editing. All you need to do is mark the areas in the photo that you want to change and specify the desired edits in a menu. Thanks to the support of AI, with just a few clicks of the mouse anyone can adjust things like the pose, facial expression, direction of gaze, or viewing angle, for example in a pet photo.
Christian Theobalt, Managing Director, Max Planck Institute for Informatics, Director, Saarbrücken Research Center for Visual Computing, Interaction and Artificial Intelligence, and Professor, Saarland University
This is made possible by artificial intelligence, specifically a type of model known as “Generative Adversarial Networks,” or GANs. “As the name suggests, GANs are capable of generating new content, such as images. The term ‘adversarial’ refers to the fact that GANs involve two networks competing against each other,” explains Xingang Pan, a postdoctoral researcher at the MPI for Informatics and the first author of the paper.
A GAN comprises a generator, which is responsible for creating images, and a discriminator, whose task is to determine whether an image is real or produced by the generator.
These two networks are trained in tandem until the generator produces images that the discriminator cannot distinguish from real ones.
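To make the adversarial setup concrete, here is a minimal training-loop sketch, assuming a toy PyTorch setup: tiny fully connected networks and random tensors standing in for real images. The layer sizes, learning rates, and data are illustrative assumptions and are unrelated to the networks used in the DragGAN work.

```python
# Minimal GAN training sketch (toy sizes and random "real" data for illustration only).
import torch
import torch.nn as nn

latent_dim, img_dim = 16, 64   # assumed toy dimensions

# Generator: maps a random latent vector to a (flattened) image.
G = nn.Sequential(nn.Linear(latent_dim, 128), nn.ReLU(),
                  nn.Linear(128, img_dim), nn.Tanh())
# Discriminator: outputs a logit for "this image is real".
D = nn.Sequential(nn.Linear(img_dim, 128), nn.ReLU(),
                  nn.Linear(128, 1))

opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

for step in range(1000):
    real = torch.randn(32, img_dim)          # stand-in for a batch of real images
    fake = G(torch.randn(32, latent_dim))    # images produced by the generator

    # Discriminator step: push real images toward label 1, generated ones toward 0.
    d_loss = bce(D(real), torch.ones(32, 1)) + bce(D(fake.detach()), torch.zeros(32, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator step: try to make the discriminator call generated images "real".
    g_loss = bce(D(fake), torch.ones(32, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```

In this toy loop the two objectives pull against each other; training is considered converged when the discriminator can no longer tell generated samples from real ones better than chance.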
GANs are used for many purposes. Besides the obvious application as image generators, they are good at predicting images, which enables video frame prediction.
This has the potential to reduce the data needed for video streaming by anticipating the next frame of a video. GANs can also upscale low-resolution images, enhancing image quality by computing what the additional pixels of the enlarged image should look like.
In our case, this property of GANs proves advantageous when, for example, the direction of a dog's gaze is to be changed in an image. The GAN then essentially recalculates the entire image, predicting where each pixel must end up given the new viewing direction.
Xingang Pan, Study First Author and Postdoctoral Researcher, MPI for Informatics, Saarland University
Pan added, “A side effect of this is that DragGAN can calculate things that were previously occluded by the dog's head position, for example. Or if the user wants to show the dog's teeth, he can open the dog’s muzzle in the image.”
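DragGAN performs this kind of edit by optimizing the generator's latent code with feature-based motion supervision and point tracking, so that user-defined handle points move toward target points. The sketch below only illustrates that underlying idea under toy assumptions: a hypothetical stand-in generator (`toy_generator`), a single handle/target pair, and a pixel-space loss. It is a conceptual sketch, not the authors' implementation.

```python
# Conceptual sketch: drag a handle point toward a target by optimizing the latent code.
# `toy_generator` is a hypothetical stand-in for a pretrained GAN generator; everything
# here is illustrative and not the DragGAN implementation.
import torch
import torch.nn.functional as F

def toy_generator(w: torch.Tensor) -> torch.Tensor:
    """Hypothetical differentiable generator: latent (1, 32) -> image (1, 3, 64, 64)."""
    g = torch.Generator().manual_seed(0)
    proj = torch.randn(32, 3 * 64 * 64, generator=g)   # fixed random "weights"
    return torch.tanh(w @ proj).view(1, 3, 64, 64)

w = torch.randn(1, 32, requires_grad=True)   # latent code to be edited
handle, target = (20, 20), (20, 40)          # user "grabs" handle and drags it to target
opt = torch.optim.Adam([w], lr=0.01)

def patch(img, p, r=2):
    """Small square patch of the image around pixel p."""
    return img[..., p[0]-r:p[0]+r+1, p[1]-r:p[1]+r+1]

with torch.no_grad():
    ref = patch(toy_generator(w), handle).clone()   # content originally at the handle

for step in range(200):
    img = toy_generator(w)
    # Motion-supervision-style loss: the content that started at the handle point
    # should now appear at the target point in the regenerated image.
    loss = F.mse_loss(patch(img, target), ref)
    opt.zero_grad(); loss.backward(); opt.step()
```

In the published method, the supervision operates on intermediate generator features rather than raw pixels, the handle points are moved in small steps and re-tracked after each optimization step, and an optional mask restricts which region of the image may change.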
DragGAN could also find applications in professional settings. For example, fashion designers could use it to adjust the cut of clothing in photographs after the initial capture.
Likewise, vehicle manufacturers could explore various design configurations for planned vehicles. While DragGAN works on a range of object categories such as cars, animals, people, and landscapes, most of its results so far are achieved on GAN-generated synthetic images.
How to apply it to any user-input images is still a challenging problem that we are looking into.
Xingang Pan, Study First Author and Postdoctoral Researcher, MPI for Informatics, Saarland University
Only a few days after its release, the new tool developed by the Saarbrücken-based computer scientists is already causing a stir in the international tech community and is regarded by many as the next big step in AI-assisted image processing. While tools such as Midjourney can be used to create entirely new images, DragGAN could greatly ease their post-processing.
The new technique is being developed at the Max Planck Institute for Informatics together with the Saarbrücken Research Center for Visual Computing, Interaction and Artificial Intelligence (VIA), which was opened there in collaboration with Google. The research consortium also includes experts from the Massachusetts Institute of Technology (MIT) and the University of Pennsylvania.
Besides Professor Christian Theobalt and Xingang Pan, the contributors to the paper, entitled “Drag Your GAN: Interactive Point-based Manipulation on the Generative Image Manifold,” were Thomas Leimkuehler (MPI INF), Lingjie Liu (MPI INF and University of Pennsylvania), Abhimitra Meka (Google), and Ayush Tewari (MIT CSAIL). The paper has been accepted to the ACM SIGGRAPH conference, the world’s largest professional conference on computer graphics and interactive technologies, held in Los Angeles, August 6-10, 2023.
Source: https://www.uni-saarland.de/en/home.html