{"id":573,"date":"2022-11-11T06:00:00","date_gmt":"2022-11-11T06:00:00","guid":{"rendered":"https:\/\/blueandread.asbarcelona.com\/?p=573"},"modified":"2022-11-11T06:07:38","modified_gmt":"2022-11-11T06:07:38","slug":"this-class-does-not-exist","status":"publish","type":"post","link":"https:\/\/blueandread.asbarcelona.com\/?p=573","title":{"rendered":"This Class Does Not Exist"},"content":{"rendered":"\n<p class=\"wp-block-paragraph\"><em>At the end of this article, you will find the link to a Real or Fake game. Will you be able to distinguish between human creations and AI generations?<\/em><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">This class does not exist. Yes. You read that correctly. That classroom does not exist. It isn\u2019t a drawing either, nor is it some sort of video game. It was generated in under 30 seconds after I asked an AI to show me a \u201crealistic photograph of a High School classroom, with school desks and chairs. A whiteboard in the back.\u201d<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Earlier this year, OpenAI released the second iteration of their scarily good image-generating AI: DALL\u00b7E 2. DALL\u00b7E 2 is a system that receives a text input (such as \u201ca marble statue of an apple falling from a tree\u201d) and then creates an image based on that input. The key word here is <em>create<\/em>. The AI does not simply mimic a search engine and find images on the internet that match the user\u2019s prompt. DALL\u00b7E 2 actually <em>creates<\/em> the image from scratch, meaning the user will \u201creceive\u201d something that has never existed before.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">This younger yet smarter sibling of DALL\u00b7E (which was released in 2021) doesn\u2019t just output higher-resolution images; it can also replace objects within an image you upload or even expand it. 
DALL\u00b7E 2 is also better at understanding prompts and combining them in images.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Now, you might be wondering: how does it all even work? Officially, it is a complicated matter of contrastive language-image pre-training, paired with a diffusion model to arrive at the final product. Now, unless you are an engineer at NASA (or OpenAI), the definition of \u201ccontrastive language-image pre-training\u201d probably doesn\u2019t immediately come to mind. Therefore, I\u2019ll try to explain the process in a more approachable way.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Let\u2019s say you\u2019re a computer scientist who wants to teach a computer to recognize fruits in different images. Before you can submit an image and have the computer decide what fruit it shows, the computer needs to learn, or train its neural network, to know what each fruit looks like. Traditionally, you would gather a bunch of images of apples, for example, and put them all in the \u201cApples\u201d category. The computer then looks at all of these images and tries to find patterns among them. It might realize, for example, that most of the \u201cApples\u201d are red, or notice their similar shape. You would do the same thing with bananas. The computer would look at all the images you classified as \u201cBananas\u201d and notice they are all long and yellow.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">After training the computer with these images, it would be ready to decide the category of a new image. You decide to submit an image of an apple and ask the computer what category it belongs to. Although the computer has never seen <em>this<\/em> particular image of an apple before, it recognizes that it is red, and that its shape is more consistent with that of the images in \u201cApples\u201d than \u201cBananas.\u201d It therefore decides that the best category for this image is \u201cApples.\u201d The best part? 
The computer can now train itself based on this new image! It can keep improving at recognizing the difference between the \u201cApples\u201d and \u201cBananas\u201d categories, without the need for any human input.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">This is where the cleverness of the DALL\u00b7E 2 system really shines. While the computer was being trained, researchers showed it hundreds of millions of images from the internet. The catch? They didn\u2019t classify them with generic categories like \u201ctree\u201d or \u201ckeyboard.\u201d They instead showed the computer the original captions for those images. So, if an image was found on a gardening blog, they didn\u2019t intercept it and label it with \u201cTree.\u201d They simply let it through with its original caption from the blog, which may have been \u201coak tree in autumn.\u201d The same would happen for other images. Let\u2019s say another image was shown to the computer from the Department of Agriculture, this one captioned \u201ctall maple tree in the summer.\u201d The computer doesn\u2019t really know what the words in the caption actually correspond to in the image, but after being shown many more pictures with the word \u201ctree\u201d in the caption, it may start to realize that most of the images that include <em>tree<\/em> in the caption contain a big mass with leafy arms. It\u2019s essentially creating a description of what a \u201ctree\u201d looks like. Similarly, the computer is also being shown images captioned as \u201cbeautiful autumn landscape\u201d from Instagram. The computer has no idea what \u201cbeautiful\u201d or \u201clandscape\u201d mean (yet), but it remembers that image it saw of an \u201coak tree in <em>autumn<\/em>.\u201d It might notice that the trees in both images have a red tone, which leads it to assume autumn means red trees.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">But how do we get from this to an image? 
Now that the AI can associate words with parts of an image, it starts from an image of pure random noise and makes small modifications to it. After each iteration, it checks whether anything in the image matches words in the prompt. If so, it keeps making that part of the image clearer, until there is no noise left.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" src=\"https:\/\/blueandread.asbarcelona.com\/wp-content\/uploads\/2022\/11\/Video_20221110_232707_547.mp4\" alt=\"\" class=\"wp-image-592\"\/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">The advantages of this system over the traditional approach are clear and varied. Firstly, we don\u2019t need a human slowly classifying images to teach the computer. The computer can teach itself! Secondly, there can be practically unlimited categories, as they are no longer up to humans. Both of these, when applied to DALL\u00b7E 2, mean more detailed images, a better understanding of the prompt, and the ability to keep learning on its own.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Oh, remember how I said you can also expand images? That\u2019s called Outpainting. 
Here is that same classroom from the thumbnail.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"512\" src=\"https:\/\/blueandread.asbarcelona.com\/wp-content\/uploads\/2022\/11\/DALL\u00b7E-2022-10-13-09.33.00-A-realistic-photograph-of-a-High-School-classroom-with-school-desks-and-chairs.-A-whiteboard-in-the-back.-1024x512.png\" alt=\"\" class=\"wp-image-575\" srcset=\"https:\/\/blueandread.asbarcelona.com\/wp-content\/uploads\/2022\/11\/DALL\u00b7E-2022-10-13-09.33.00-A-realistic-photograph-of-a-High-School-classroom-with-school-desks-and-chairs.-A-whiteboard-in-the-back.-1024x512.png 1024w, https:\/\/blueandread.asbarcelona.com\/wp-content\/uploads\/2022\/11\/DALL\u00b7E-2022-10-13-09.33.00-A-realistic-photograph-of-a-High-School-classroom-with-school-desks-and-chairs.-A-whiteboard-in-the-back.-300x150.png 300w, https:\/\/blueandread.asbarcelona.com\/wp-content\/uploads\/2022\/11\/DALL\u00b7E-2022-10-13-09.33.00-A-realistic-photograph-of-a-High-School-classroom-with-school-desks-and-chairs.-A-whiteboard-in-the-back.-768x384.png 768w, https:\/\/blueandread.asbarcelona.com\/wp-content\/uploads\/2022\/11\/DALL\u00b7E-2022-10-13-09.33.00-A-realistic-photograph-of-a-High-School-classroom-with-school-desks-and-chairs.-A-whiteboard-in-the-back.-1536x768.png 1536w, https:\/\/blueandread.asbarcelona.com\/wp-content\/uploads\/2022\/11\/DALL\u00b7E-2022-10-13-09.33.00-A-realistic-photograph-of-a-High-School-classroom-with-school-desks-and-chairs.-A-whiteboard-in-the-back..png 2048w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">Notice how the center of the image is the same, while more has been added to the left and right edges. However, why does this all matter? Sure, it\u2019s pretty cool tech, but it\u2019s an even better tool. 
Anyone, including you, can head over to <a href=\"https:\/\/openai.com\/dall-e-2\/\">the DALL\u00b7E 2 website<\/a>, sign up, and start generating images immediately (up to 15 per month, plus 50 as a sign-up bonus). All for free. You legally own the images generated too, so you are free to use them in your own works and websites, and even commercially!<br>For example, a few weeks ago my Catalan class at ASB was given an assignment to write a story. After my group had drafted a few paragraphs, we began to play around with the idea of illustrations. They were not required, but I felt they would add to the atmosphere, spark the reader\u2019s imagination, and be a fun addition. Now, due to my group\u2019s lack of artistic abilities, outsourcing would be required. So, I turned to a little AI I had recently been granted access to. I wasn\u2019t very hopeful that the AI would be able to both capture the details I wanted in the drawing <em>and<\/em> make it <em>look<\/em> like a drawing as opposed to a photo, but I still gave it a shot. 
Dispiritedly, I asked it to generate \u201cA Ford Ranger F100 driving on a desert road with 3 passengers in the pickup, one driver, digital art.\u201d The results, delivered in a few seconds, were glorious.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"1024\" src=\"https:\/\/blueandread.asbarcelona.com\/wp-content\/uploads\/2022\/11\/DALL\u00b7E-2022-10-29-13.22.57-A-Ford-Ranger-F100-driving-on-a-desert-road-with-3-passengers-in-the-pickup-one-driver-digital-art.png\" alt=\"\" class=\"wp-image-576\" srcset=\"https:\/\/blueandread.asbarcelona.com\/wp-content\/uploads\/2022\/11\/DALL\u00b7E-2022-10-29-13.22.57-A-Ford-Ranger-F100-driving-on-a-desert-road-with-3-passengers-in-the-pickup-one-driver-digital-art.png 1024w, https:\/\/blueandread.asbarcelona.com\/wp-content\/uploads\/2022\/11\/DALL\u00b7E-2022-10-29-13.22.57-A-Ford-Ranger-F100-driving-on-a-desert-road-with-3-passengers-in-the-pickup-one-driver-digital-art-300x300.png 300w, https:\/\/blueandread.asbarcelona.com\/wp-content\/uploads\/2022\/11\/DALL\u00b7E-2022-10-29-13.22.57-A-Ford-Ranger-F100-driving-on-a-desert-road-with-3-passengers-in-the-pickup-one-driver-digital-art-150x150.png 150w, https:\/\/blueandread.asbarcelona.com\/wp-content\/uploads\/2022\/11\/DALL\u00b7E-2022-10-29-13.22.57-A-Ford-Ranger-F100-driving-on-a-desert-road-with-3-passengers-in-the-pickup-one-driver-digital-art-768x768.png 768w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">The AI had not only created the super specific car model, a Ford Ranger F100, but it had also nailed the digital art aesthetic. From the gentle gradient in the sky to the photorealistic shadows cast by the car, the \u201cdrawing\u201d appeared virtually perfect. 
This image, which would have taken my friends and me (or even a professional digital artist) hours to draw, was now mine to use however I pleased.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">So, here we are, in an era where AIs can draw with almost the same fidelity as humans, and create photorealistic images from nothing. I\u2019ll let you know if my next article is written by a computer.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">BONUS CHALLENGE: Can you tell real\/human-made images from DALL\u00b7E 2 generations? Submit your answers <a href=\"https:\/\/docs.google.com\/forms\/d\/e\/1FAIpQLSeU-SWrwnws77cUkRQgzY-CFCzj0qsvv4scaAuPuEFUZFq79A\/viewform?usp=sf_link\">in this Form<\/a>! Winners will be announced in the next edition!<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<p class=\"wp-block-paragraph\">Bibliography: <\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Chen, James. \u201cWhat Is a Neural Network?\u201d Investopedia, Investopedia, 22 Sept. 2022, https:\/\/www.investopedia.com\/terms\/n\/neuralnetwork.asp.<br>Fein, Daniel. \u201cDall-E 2.0, Explained.\u201d Towards Data Science, 16 May 2022, https:\/\/towardsdatascience.com\/dall-e-2-0-explained-7b928f3adce7.<br>OpenAI. \u201cDall\u00b7E 2.\u201d OpenAI, OpenAI, 14 Apr. 2022, https:\/\/openai.com\/dall-e-2\/.<br>OpenAI. Terms of Use, 20 July 2022, https:\/\/labs.openai.com\/policies\/terms.<br>Ramesh, Aditya. How DALL\u00b7E 2 Works, http:\/\/adityaramesh.com\/posts\/dalle2\/dalle2.html.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>At the end of this article, you will find the link to a Real or Fake game. 
Will you be able to distinguish&#8230;<\/p>\n","protected":false},"author":33,"featured_media":574,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[5,1],"tags":[],"class_list":["post-573","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-science-nature","category-uncategorized"],"_links":{"self":[{"href":"https:\/\/blueandread.asbarcelona.com\/index.php?rest_route=\/wp\/v2\/posts\/573","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/blueandread.asbarcelona.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blueandread.asbarcelona.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blueandread.asbarcelona.com\/index.php?rest_route=\/wp\/v2\/users\/33"}],"replies":[{"embeddable":true,"href":"https:\/\/blueandread.asbarcelona.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=573"}],"version-history":[{"count":9,"href":"https:\/\/blueandread.asbarcelona.com\/index.php?rest_route=\/wp\/v2\/posts\/573\/revisions"}],"predecessor-version":[{"id":593,"href":"https:\/\/blueandread.asbarcelona.com\/index.php?rest_route=\/wp\/v2\/posts\/573\/revisions\/593"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/blueandread.asbarcelona.com\/index.php?rest_route=\/wp\/v2\/media\/574"}],"wp:attachment":[{"href":"https:\/\/blueandread.asbarcelona.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=573"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blueandread.asbarcelona.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=573"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blueandread.asbarcelona.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=573"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}