
This is the first in a number of posts.
In an August 2021 Washington Post article, Hironobu Sakaguchi and Nobuo Uematsu discussed their work on Fantasian, which was about to receive its final update. Although Fantasian was an online iOS game, the collaboration allowed Sakaguchi and Uematsu to reconnect with their original approach to making RPGs.
Sakaguchi and Uematsu are two of the oldest and most important influences behind the Final Fantasy series. Both were involved in the first three entries on the NES (‘87-‘90) and both were present and active all the way through Final Fantasy X (2001).
Gamers who were hooked in those early years probably noticed a few common elements. No early Final Fantasy story was sequential with any other but there were many recurring story elements. Storytelling shared the foreground with gameplay. Since Final Fantasy was the most visible face of the Japanese RPG in America, many Americans associate Final Fantasy with separate battle and navigation screens. There was something else, though, that’s not so easy to summarize.
When FFIV came out on the SNES, the chibi art style probably excited little comment. It made sense that Square would rely on its last reference point from the NES. FFV still had chibis, but now the chibis had facial expressions and body language. Mega Man and Mario pulled off huge visual overhauls with the jump to 16 bits. Final Fantasy played it safe, with the increased graphical capabilities used to build on what came before. The simple sprites became more doll-like, with facial features reminiscent of anime. IV, V and VI used the 16-bit graphics for enemy sprites and backgrounds during the combat screen, which looked either painted or drawn. All of your player characters were still chibi dolls. These specialized uses of complementary art styles even lasted until the move to the PlayStation. Between VII and IX, the battle screens were filled with polygons, along with the “overworld” section. The exploration screen now had polygon characters against a more detailed pre-rendered background.
Many of those qualities disappeared after X, when Nobuo Uematsu and Hironobu Sakaguchi began to step back.

In the Washington Post article, Sakaguchi and Uematsu discuss Fantasian as a return to their JRPG roots. This game was developed in 2014 and the contemporary software was once again used to build on their traditional approach to JRPG storytelling.
Hand-made dioramas were photographed for environments containing the doll-like, polygonal characters. When talking about his recent playthrough of FFVI, Sakaguchi compared the art style of early FF to a puppet show.

Think about the tone of some of those early to mid FFs. Particularly IV and VI. Themes of wartime atrocity, mental illness, suicide and the end of the world stand side by side with moon rabbits looking for their calling and a pun-loving octopus. Whimsy and tragedy co-exist easily in non-literal storytelling. The same flexibility that enables erratic tone shifts also enables some unexpected emotional blindsides. Final Fantasy VI was the first to deviate from the traditional swords-and-sorcery subject matter but Final Fantasy VII brought the puppet show into 3D.

Final Fantasy VIII had a futuristic story with a heavy anime influence. IX played it safe with a Jim Henson/Henry Selick-like fantasy world. X was a meeting between the old and new guard. Final Fantasy VII was a fifty-fifty split between the traditional puppet show aesthetic and the later variations.
The world-building of VII is only slightly more daring than VI. The main variation is in its complexity. VII is also less interested in a traditional fantasy origin story: human society, in VII, is divided on how to interpret history. Which made it feel a little more modern than VI. FFVII had whimsy but nothing on the level of Namingway in IV or Ultros in VI.

The use of the chibi-doll polygons against the more detailed pre-rendered backgrounds brought a level of surrealism. When I first played FFVII on the PC around 2000, there was a glitch in the opening FMV and one of the chibi train attendants was briefly superimposed over the crowded streets of Midgar. As the camera rose over the cityscape, the train attendant who looked like a doll ran offscreen.

At first, I thought this was intentional. I had played Super Smash Bros. recently which revolved around a magical glove that brings toy Nintendo characters to life. Toy-based metafiction was precedented in game design, even before Smash. The glitch never repeated, but it did suggest to me that there were actual human characters here represented with symbolic toys. Other things, like the combat system (which is obviously not a literal representation of what is going on) backed this up.

The varying art styles in the FMVs are a major reason why the Washington Post article rang true to me. Fully animated cut scenes have no function other than supporting a narrative. Their purpose is identical to flavor text. In a high-stakes move to a new platform with an unprecedented Western ad campaign, Square was limited only by their imaginations and hardware. The decision they made was to have some cut scenes with chibi dolls and other cut scenes with more realistically-proportioned characters.
I’ve always remembered the scene with Barret comforting Tifa after Cloud falls through the suspended structure over Sector 6. It has an almost Rankin/Bass stop-motion quality. Tifa’s escape from Junon to the Highwind also had chibi dolls.

There were also interesting moments when the dialogue boxes fleshed out details of more intimate moments. Things that couldn’t be depicted with the chibi dolls, like Jessie rubbing the soot off of Cloud’s face or Barret’s whiskers scratching his daughter when he cuddles her. The normalization of these smaller, non-literal emotional beats establishes believability for more serious moments later on, such as the Nibelheim flashback. Even the more comically awkward scenes like Cloud’s cross-dressing infiltration benefited from this.
This also strengthened the immersive quality of the dialogue boxes: it’s easy to hear the character’s voices in your imagination when you’ve already accepted that there are more intimate, human events that exist whether or not you see them. The pathos of the non-literal character interactions also brought dramatic weight to the story’s larger-than-life scale.

Critics of remaking FFVII across multiple games overlook this. The puppet show’s distance from reality opens a wider scope for storytelling. By using graphics to establish symbols rather than direct representations, there is less of a need to let the ordinary unfolding of life and physics bog down the narrative. If Final Fantasy VII was ever going to be remade as a modern video game with realistic or cinematic graphics, it would have to be a very different story…or find another way to convey its scope. To tell a story with a realistic sense of scope, breaking the story into multiple games is the best way to cover every point of faithfulness and give it all room to breathe.
But none of those cinematic, hyper-realistic games will have the same tone. Motion-capture and granular texturing directly affect how the tone informs the scope of the story. Everything would rely on a sense of human physical proportion.

The way in which the puppet show aesthetic exploited the intersection between tone and scale even has a relationship with the literary genre referenced in the name.
I know there are innumerable different opinions on what constitutes any genre. But I believe that fantasy is defined by a relationship with mythology. More than swords and sorcery, more than treasures of the elements and magic swords, more than races of supernatural creatures. The power of fantasy is channeled through mythology.

J.R.R. Tolkien, C.S. Lewis, Lord Dunsany, J.M. Barrie and every other foundational fantasy writer were all aware of this. H.P. Lovecraft was aware of it and tried to incorporate this mythic influence into his own work. Tolkien, Dunsany and Lovecraft were so smitten with the desire to capture the language and tone of ancient texts that they became famous for being dry. In high school, I had a classmate who said that The Fellowship of the Ring was accessible as historical fiction, The Two Towers felt like historical fiction with heavy ancient world atmosphere and The Return of the King was “the Old Testament.”
While Tolkien emulated the tone of ancient poetry and epics, C.S. Lewis coordinated his relationship with mythology less directly. He insisted that The Chronicles of Narnia was not a Christian allegory: it was a depiction of a world that ran parallel to his Christian world view. Aslan was not a symbolic representation of Christ; Aslan was literally Christ in the world of Narnia. To use a concept from a separate religious tradition, Aslan could be described as an “emanation” of Christ. Lewis’ Space Trilogy dealt with other worlds that exist before and after their respective Falls from grace in their respective Edens.
Lovecraft wanted to capture a sense of classical authenticity denying us cosmic validation. A voice from the past informing the present that the search for meaning is doomed to fail. While Hans Christian Andersen wrote fairy stories from his imagination, his work reflected the influence of both European folklore and Christianity.
I’ve always suspected that fantasy storytellers are motivated by a personal relationship with mythology. And mythology is our oldest storytelling tradition of dealing with the unknown and what matters most. At the same time, they are not reducible to an allegory or a metaphorical treatise. The first humans to hear the first creation stories did not think that they were listening to imagination or metaphor. Many modern fantasy readers and writers (like myself) don’t think the value of fantasy can be reduced to anything pragmatic. A good artist works with the outside world, so it makes sense to incorporate things like social commentary and matters of personal belief and observation. Those are things that people relate to and they are some of the building blocks of good storytelling. But no single one of those dimensions captures the essential value.

On some level, we still hear literal truth within mythology.
Or, perhaps more accurately, we hear experienced truth, and no experience is reducible to a single specific meaning. Meaning is an effect of experience, not a cause.
Many ancient myths, to modern readers, are simple stories. Things can be deep and powerful while being simple. A good pop-rock musician can make three to four minutes do a lot of work. Simplicity is probably one of the oldest qualitative benchmarks in the history of creativity.
High artistic benchmarks usually have a high failure rate, though. And fantasy is simultaneously one of the most beloved and most derided literary genres. Opinions tend to cluster into child’s play, garbage or the highest of the high.
Final Fantasy itself is a good example of what can go wrong. One of the most common criticisms of the series is that things get complicated. I have nice things to say about the story of XIII, which might put me on thin ice to begin with, but not even I can reconcile the world-building between XIII-2 and Lightning Returns. The story and the cosmology of the first XIII game worked well together. The world-building of the next two games completely ignored each other’s continuity.
World-building minutiae can create a sense of authenticity and immersion. But they can just as easily derail the tone of the main story.
FFIV also has cluttered world-building. But it didn’t excite the same exasperation that XIII did among the fan base. The graphical difference between the first SNES Final Fantasy (IV) and the first PS3 Final Fantasy (XIII) necessarily affects the tone. The tonal impact of the graphics is one reason why the science-fiction aesthetic of XIII grated on me the way it did. While scrolling between the stats of your party members, a picture of the relevant character will appear with brief facial movements. The intent was to create the effect of a face seen on a security camera recording immediately before someone “pausing” it. Whenever something happens that resembles magic, there are usually musical cues signaling a tone shift from the futuristic atmosphere. XIII also had a relentlessly serious tone. A dark or dour tone won’t break a story on its own but when it’s stacked on top of extremely detailed world-building, the risks add up. In addition to the tone and the world-building, the graphics of the PS3 entangle its sense of physical and emotional scale with human bodies, faces and voices.
It could be argued that a technology-heavy, futuristic setting does not have to draft detailed renders of human characters into a less fluid tone. Wall-E was a computer-animated movie about a sentient AI cleaning robot which kept the tone as whimsical as anything else Pixar did, like Toy Story. Wall-E also waited until the second half of the movie to introduce human characters, though. The robots, with their wildly varying shapes, were allowed to set the tone by being the only characters in the first act.
FFIV may have had a long and complicated story but it also took itself less seriously. Or maybe its overall aesthetic made it more approachable.
The game starts with Cecil, a military commander in the fictional nation of Baron, having just raided a village under orders from his king. When he questions the morality of these orders back home, he is punished with a menial delivery task. Upon arrival, the object he was told to carry turns into a magical weapon of mass destruction and levels the surrounding city. Cecil realizes that he has been trapped in a “blood in blood out” arrangement. His opinion no longer matters because he has already shared the guilt of his comrades. In spite of this, the plight of a young girl who was orphaned by his unwitting attack causes him to defect.
He leaves the scene of the carnage with her because he knows his fellow soldiers will likely sweep the area looking for survivors. She fights him and hates him every step of the way. Soldiers of Baron soon try to take both Cecil and the girl, Rydia, into custody, and he fights them off. This is the moment that changes Rydia’s mind about him.
There are a few different ways to take this. Rydia’s mother was not killed in the same wave of destruction that destroyed her home. Rydia belongs to a people called summoners who have symbiotic relationships with magical beings. Before entering the village, Cecil was attacked by a dragon which he succeeded in killing. This dragon was in an entangled symbiosis with Rydia’s mother. Because of Cecil, her mother was dead before he even set foot in her village.
Most people would not easily forgive the person who kills their mother. It also must be said that Cecil did these things unwittingly. He had no way of knowing that the dragon was anything but a dragon or that the package he was delivering would basically explode. On the level of conscious intention, Cecil is innocent, but intentions do not ameliorate trauma. Trauma can also narrow perspective with panic. While fleeing Nazis in WWII, it’s safer to travel with a defecting Nazi than a Nazi true believer. Or maybe the example of his violent insubordination actually convinced Rydia of his commitment to protect her.
Since this is all happening with chibi dolls, it’s easy not to react the same way as you would with a live-action portrayal. The tone doesn’t try to force your empathy. This is not the same as saying it doesn’t matter anyway: there definitely would have been a wrong way to do it. Rydia’s initial hatred and resistance to Cecil makes her eventual acceptance more convincing. More so than it would have been if, for example, she never blamed him for anything. It would have rung equally false if Rydia leapt from her bed and ran to hug Cecil as soon as he fought off the soldiers who were sent to capture them.

The doll-like appearance of the character sprites does not invite visceral empathy or identification. It would have been easy to make it cartoonish. The simple presentation goes over better with more concise dialogue anyway. If your conversations need to be brief, it would be intuitive to lean into melodrama to extract the most value from the shortest amount of space. Instead, after fighting off the soldiers, Cecil tells Rydia that he wouldn’t dare to ask for her forgiveness or affection but he will still do everything he can to protect her. Her reply: “Promise?” This is the first non-combative statement she offers him.
I’m not saying Final Fantasy IV isn’t melodramatic or escapist. A lot of characters appear to die with maximum pathos who turn out to be alive again later. You travel to an underworld filled with dwarves and fairies and even end up on the moon. It’s as escapist as it gets. But FFIV is a better game than it would have been if it leaned into a cartoonish tone to complement the cartoonish appearance. FFXIII made thorough use of the PS3’s graphics for both spectacle and grittiness. IV balanced its appearance with writing, whereas XIII’s writing accommodated the appearance. The result was that XIII appeared more melodramatic to westerners (at least) than the 8-16 bit games.
Balancing cartoonish graphics with text and scenarios that are not cartoonish is a win but it is not the sole strength of the puppet show. There’s something about a lack of physical realism that enables easier mental access to certain things. Anne Rice said that her supernatural novels enabled her to talk more directly about spirituality and philosophy than her realistic ones. The appearance of something like a puppet may be cute, quaint or artsy. They look like simple representations that allow for artistic freedom but not literal truth, so it’s easier for aesthetics to dominate the first impression. If you start with aesthetics, it is a short leap to imagination. With a little bit of emotional realism (rather than visual), non-literal representation can access vast potential.

This is why I find it so easy to be reminded of non-textual allusions throughout the first Final Fantasy VII for the PS1. The game starts in a city called Midgar with two horizontal tiers: the ground and the upper plate. At the beginning, it’s easy to overlook the fact that you are in a mako reactor immediately beneath the upper plate. After y’all blow it up, everyone escapes onto the upper plate and from there they catch a train to their hideout on the ground level.
This is one of only two glimpses of the upper plate in the whole game. And the story basically starts there. The opening cutscene starts with Aerith emerging from an alley in a crowded sidewalk beside an intersection where we briefly run into her after the bombing mission. The opening cutscene makes it visually clear that both Aerith and the route to the train station are on the upper plate but it’s easy to forget; especially since our starting player characters are so ideologically aligned with the people living under the plate.
I remember at least a few fans talking about a scene near the end when the player characters parachute onto Midgar from above as if it were the only time we ever see the upper plate. Apparently, more than one western gamer did not immediately think of the opening scene as taking place on the upper plate. Especially since your main task in the beginning is blowing up a mako reactor, and the mako reactors are tower-like structures between the two plates anyway.

While you’re there in the beginning, though, consider the visual cues. Immediately after your escape, you crawl through a tunnel into an open indoor space with black and white floor tiles and destroyed statues. From there, you emerge into a street beside skyscrapers and strips. It’s still early in the game so it might not be obvious that you would only see things like this on the upper plate. In the pre-rendered backgrounds the shadowplay is directed by fluorescent streetlamps. The general, pervading darkness is suggestive of a night sky. There are giant banners advertising a play called Loveless, a few of the footpaths are cobblestones and the cars look like they came from the forties or fifties. It has a New York-flavored, classic film atmosphere. After this brief passage across the upper plate, the party returns to the slums below by train.
Although the ground-level slums are very different from the upper plate, the disembarking on the train station below still maintains the atmosphere of nighttime urban romance. A young couple happily reunites beside you. You overhear them talking about a separate, abandoned train depot that’s rumored to be haunted. The girl is wearing a leather jacket and punk swag that could have come from the eighties. Cloud arrives at the Seventh Heaven with everyone else and reunites with his childhood friend, Tifa, who apparently got him involved in the bombing to begin with. Cloud and Tifa share an extremely non-literal flashback. We’re in the Sector 7 slums, under a plate, but a brief cut appears to take us near a water tower under a night sky. The adult chibi-dolls are soon replaced by child chibi-dolls. Another cut brings us back to the bar beneath the plate. The player learns, later on, that the flashback depicted something that happened on a separate continent.
During the moment where the setting of the flashback is inhabited by the adult characters, we’re not quite in the memory yet. We’re just seeing adult Cloud and adult Tifa talk about it. Basically, we’re being introduced to a psychological use of environments at the start of the game. Considering the role that belief and delusion play in the rest of the story, this has got to be intentional.

Before this early stage of the game, there are other indications of non-literal storytelling that could be easily overlooked. The game begins with a long credits roll, like a film. The starting screen does not have a logo. The only text is your two options: ‘New Game’ and ‘Continue.’ The only image is Cloud’s buster sword, angled with its point downward, surrounded by a spotlight. If you manage to get KO’d, you’ll see a game over screen with a broken strip of film and a film reel canister off to the side. If you see that screen before escaping from the reactor, the old-fashioned cars and cobblestones imply an even more direct classic film aesthetic. The only thing that stops me from making comparisons with noir is that there are too many colors (however subdued).
On this note- when development started on Final Fantasy VII, it was originally planned to take place in twentieth-century New York and would have told the story of a detective. The detective eventually made it into the final game, after many revisions, as the character Vincent Valentine. Square’s New York-based detective concept would later be used for Parasite Eve, which was released very close to Final Fantasy VII. Parasite Eve was something of a survival-horror game and therefore had a darker tone than Final Fantasy. The police-procedural plot structure and the darker atmosphere landed much closer to noir than FFVII.

Maybe classic film (noir or otherwise) was an early influence in FFVII. Maybe not. I lean toward affirmative. Especially since discovering Vincent, the original detective character, will connect several plot threads. His entrance to the story functions as an arch-clue solving a number of mysteries. To say nothing of the WEAPON monsters later on, which are evocative of the Japanese kaiju movies of the sixties like Godzilla. That last part clinches it for me but I’ll have more to say about that later.
So. The torn film in the game over screen and the buster sword, spotlit as if it was onstage, are tucked into forgettable moments like losing battles and starting the game up. As out-of-the-way as they are, though, they point directly toward a kind of metafiction. When I first played the game on PC, the glitchy train attendant all but convinced me that FFVII was “acted out” with dolls, like Super Smash Bros. There are less direct indications, though, that also point to toy metaphors.

On the train returning everyone to Sector 7, Jessie shows Cloud a digital wire-frame model of Midgar, 1/10,000 scale. Later in the game, we pass by a physical diorama of Midgar in the Shinra Building. There is an odd set of collectible items called 1/35 SOLDIER that look like miniature train-attendant polygons. The Temple Of The Ancients is revealed to be the Black Materia and must be reduced to a size small enough to fit in one’s hand. Cait-Sith repeatedly refers to his body as a toy and says that he can shift his consciousness from one toy to another. The instruction booklet for the PS1 FFVII says, in Cait-Sith’s character profile, that he primarily resides inside of the cat and the body the cat rides on is a toy moogle that he “magically brought to life.”
That last one feels directly analogous to Sephiroth’s consciousness shifting between carriers of Jenova’s DNA while his original body is sealed in the center of Gaia. It’s also hard to shake an association with Cait-Sith when Sephiroth, “possessing” one of his clones, refers to the “end of this body’s usefulness.” Then there’s Jenova’s only line of dialogue, telepathically addressed to Cloud, calling him a “puppet.”
One of the strengths of Sakaguchi’s puppet-inspired design is that it doesn’t immediately draft your visual mind into a literal emotional language. The emotional and psychological dynamics are furnished entirely by dialogue and situations. Depending on preference, this can either completely stop immersion or completely immerse you. I found it immersive but then again I’ve never thought it was necessary for video games to emulate film. (Not that it shouldn’t- modern video games can and do succeed at that. I only mean that it is not universally necessary.) In a lot of my gaming posts, I’ve talked about how the entire gaming industry jumped on board with voice acting, whether or not it was a good idea for all games. Rather like reading, I’ve always appreciated dialogue boxes because they put the voices of the characters directly in your head. For me, the puppet show succeeds in a similar way. Especially in moments like Rydia’s acceptance of Cecil in FFIV, when a few careful writing choices can get you across the distance of abstraction.

I think a lot of the aesthetic references and allusions feel more direct because of the abstraction between the puppet show and the story it tells. It’s a reason why so many thematic bells and whistles in Final Fantasy VII are so close to the surface. It’s why I can’t play through that beginning part without being reminded of old detective movies from the forties and fifties.
BTW- if it seemed like I’m on a noir kick…it’s ’cause I am ^^
One particular trait of noir is relevant here: moral ambiguity.
To simplify the history of film a bit- German expressionism was a close cinematic cousin to noir. Expressionism freely incorporated abstractions on a few different levels- characters that embody and control things like gods and wildly creative painted backgrounds. Expressionist film establishes its own internal consistency rather than depending on real-world reference points. If expressionism is set in its own psychological world, noir is set in its own moral world.
This moral abstraction is most typically established by bleakness. Many detective movies, both then and now, are as gritty as the conventions of the day permit.

Both expressionism and noir depend on an internally-consistent world that attempts to support itself rather than bringing in literal outside reference points. Just like the fantasy genre. Early in A Song of Ice and Fire, George R.R. Martin made sure to include things like “to the Others with X” and “Others take X”. By replacing ‘Hell’ with ‘Others’, he uses the structure of common English euphemisms to establish the internal frame of reference of the novels. It’s also evident in one common criticism of The Matrix trilogy: too much in-world jargon. One review said that the scene where the Oracle says that the Keymaker is with the Merovingian is like hearing someone say “the thing said you need the thing which is held by the thing.”
Building your own internal consistency which is separate from the outside world and relatable only by analogy is hard. And like any other art form, brevity and efficiency often have to co-exist with that. Removing the possibility of direct, external reference makes things really simple and, as in so many things, simple benchmarks are often the highest and most difficult.
While fantasy may share the abstraction of expressionism, Final Fantasy includes a noir-like flourish that raises the stakes. And it’s nothing new. It’s the thing that usually gives you something to pay attention to within stories, without which people will say “nothing happened”: conflict.
More specifically, a conflict of meaning. In the most memorable Final Fantasy stories, some conflict of meaning is explored. In IV, Cecil goes from a loyal soldier to a righteous deserter. In VI, Terra starts as an unwilling pawn and goes through a variety of paradigm shifts, including (but not ending with) abandoning the quest for a simple life of good works. Zidane starts his quest as a self-interested thief and Tidus begins as a hormonal teenager trapped between puberty and emotional abandonment. Neither of them ends in those places. In all of those games, the moral stakes at the beginning are revealed to be the surface of deeper machinations.
The conflict is made specifically moral by a mistaken or misguided source of power. It could be a feudal monarchy, a religious movement, a political movement or a corporation. Final Fantasy begins with an underdog in a corrupt world and then moves on to the reality that the “corruption” is bending under. At that moment, the main character usually has to re-evaluate their motivations.
https://www.washingtonpost.com/video-games/2021/08/13/final-fantasy-creator-sakaguchi-fantasian/
https://www.thegamer.com/cloud-strife-new-york-final-fantasy-vii-development-concepts/