A Case Study on Domain-Specific Fine-Tuning: The Moby Dick Experiment

Generative AI now plays a significant role in content creation, customer engagement through chatbots, and AI-generated insights for data-driven decision-making across industries. Because it produces human-like text, images, and music, it is particularly valuable in domains such as marketing and advertising.

SageMaker JumpStart, part of Amazon SageMaker, streamlines machine learning model training and deployment. AWS recently demonstrated this by fine-tuning the GPT-J 6B model on financial data with JumpStart [1], showing that the approach adapts to domain-specific applications. Inspired by that example, we explored JumpStart’s potential by fine-tuning the same model on an unconventional dataset: Herman Melville’s “Moby Dick” [2].

Accessing SageMaker JumpStart from Studio UI

SageMaker Studio provides a UI for browsing the models and solutions available in SageMaker JumpStart. From there you can access foundation models for text and image generation, along with many other models and end-to-end machine learning solutions, and deploy them with one click.

The Moby Dick Experiment

For this case study, we used the GPT-J 6B text generation model. JumpStart offers three ways to work with it: deploy the pre-trained model to an endpoint for inference, fine-tune the pre-trained model on your own data, or open a Jupyter notebook that accesses the model programmatically through the SageMaker APIs.
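
The deploy and fine-tune options described above could be driven from code roughly as follows. This is a minimal sketch with the SageMaker Python SDK, not the notebook used in this case study. The model ID string, the channel names "train" and "validation", and the helper names are assumptions to check against the JumpStart model card. The two functions that touch AWS need credentials and will start billable resources, so they are defined here but not called.

```python
from typing import Dict, Optional

# Assumed JumpStart identifier for GPT-J 6B; verify it in the JumpStart catalog.
MODEL_ID = "huggingface-textgeneration1-gpt-j-6b"


def training_channels(train_uri: str, validation_uri: Optional[str] = None) -> Dict[str, str]:
    """Map data channels to S3 locations, rejecting anything that is not an S3 URI."""
    channels = {"train": train_uri}  # channel names are an assumption
    if validation_uri is not None:
        channels["validation"] = validation_uri
    for name, uri in channels.items():
        if not uri.startswith("s3://"):
            raise ValueError(f"{name} channel must be an s3:// URI, got {uri!r}")
    return channels


def deploy_pretrained():
    """Option 1: deploy the pre-trained model to a real-time endpoint."""
    # Imported lazily so the rest of this sketch runs without the SDK installed.
    from sagemaker.jumpstart.model import JumpStartModel

    return JumpStartModel(model_id=MODEL_ID).deploy()


def fine_tune_and_deploy(train_uri: str, validation_uri: Optional[str] = None):
    """Option 2: fine-tune on your own data, then deploy the resulting model."""
    from sagemaker.jumpstart.estimator import JumpStartEstimator

    estimator = JumpStartEstimator(model_id=MODEL_ID)
    estimator.fit(training_channels(train_uri, validation_uri))
    return estimator.deploy()
```

Keeping the channel mapping in its own small function lets the S3 paths be validated locally before any training job is launched.
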
The experiment involved several key steps:
Data Extraction: We extracted the text of the novel from Project Gutenberg [2] and checked that it was suitable for fine-tuning.
Fine-Tuning Process: Using SageMaker JumpStart, we fine-tuned the GPT-J 6B model on the language and themes of the novel. We used the notebook provided by the JumpStart solution and modified a few of the parameters used to query the model. The dataset and code are publicly available on our GitHub.
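
The data extraction step can be sketched as below. This is an illustration rather than the code from our repository: it assumes the raw file carries the usual Project Gutenberg marker lines beginning with "*** START OF" and "*** END OF", and the function names and the 10% validation hold-out are hypothetical choices. The file format that JumpStart expects for training data should be checked separately.

```python
import re
from typing import List, Tuple

# Marker lines that bracket the book text in Project Gutenberg plain-text files.
START_RE = re.compile(r"^\*\*\* ?START OF.*$", re.MULTILINE)
END_RE = re.compile(r"^\*\*\* ?END OF.*$", re.MULTILINE)


def strip_gutenberg_boilerplate(raw: str) -> str:
    """Keep only the text between the START and END marker lines, if present."""
    start = START_RE.search(raw)
    end = END_RE.search(raw)
    begin = start.end() if start else 0
    finish = end.start() if end else len(raw)
    return raw[begin:finish].strip()


def to_paragraphs(text: str) -> List[str]:
    """Split on blank lines and collapse hard line wraps inside each paragraph."""
    blocks = re.split(r"\n\s*\n", text)
    return [" ".join(block.split()) for block in blocks if block.strip()]


def split_train_validation(
    paragraphs: List[str], validation_fraction: float = 0.1
) -> Tuple[List[str], List[str]]:
    """Hold out the final fraction of paragraphs for validation, without shuffling."""
    if len(paragraphs) < 2:
        raise ValueError("need at least two paragraphs to split")
    if not 0 < validation_fraction < 1:
        raise ValueError("validation_fraction must be between 0 and 1")
    n_val = round(len(paragraphs) * validation_fraction)
    n_val = min(max(1, n_val), len(paragraphs) - 1)  # keep both sides non-empty
    cut = len(paragraphs) - n_val
    return paragraphs[:cut], paragraphs[cut:]
```

Holding out the last paragraphs instead of a random sample keeps the validation text contiguous, which is a reasonable default for a single long narrative.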

Results and Implications: For the text generation experiment, we used the following inference parameters: max_length of 250 (the maximum output length in tokens, which in the Hugging Face convention includes the prompt), num_return_sequences of 1 (number of generated responses), top_k of 150 (sample only from the 150 most likely next tokens), top_p of 0.8 (sample only from the smallest set of most likely tokens whose cumulative probability reaches 0.8), do_sample set to True (sample rather than decode greedily), and temperature of 1 (higher values give more random responses; 1 leaves the model’s distribution unchanged).
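
To make the effect of top_k and top_p concrete, here is a toy illustration in plain Python. It is not how the model server implements sampling. It only mirrors the usual order of operations (keep the top_k most likely tokens, then keep the smallest prefix whose renormalized cumulative probability reaches top_p), and the four-token distribution is invented for the example.

```python
from typing import Dict


def filter_top_k_top_p(probs: Dict[str, float], top_k: int, top_p: float) -> Dict[str, float]:
    """Apply top-k, then nucleus (top-p) filtering, and renormalize what is left."""
    # Step 1: keep only the top_k most likely tokens.
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
    total_k = sum(p for _, p in ranked)

    # Step 2: walk down the ranking until the cumulative (renormalized)
    # probability reaches top_p; the token that crosses the threshold is kept.
    kept = []
    cumulative = 0.0
    for token, p in ranked:
        kept.append((token, p))
        cumulative += p / total_k
        if cumulative >= top_p:
            break

    # Step 3: renormalize so the surviving probabilities sum to 1.
    total = sum(p for _, p in kept)
    return {token: p / total for token, p in kept}


# Invented next-token distribution, for illustration only.
toy = {"whale": 0.6, "ship": 0.25, "sea": 0.1, "harpoon": 0.05}
candidates = filter_top_k_top_p(toy, top_k=150, top_p=0.8)
```

With this toy distribution, top_k of 150 removes nothing because there are only four tokens, and top_p of 0.8 keeps "whale" and "ship" (cumulative probability 0.85) and drops the rest; the next token would then be sampled from those two.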

The examples below compare a pre-trained GPT-J with a domain-specific GPT-J fine-tuned using SageMaker JumpStart on a text completion task. Outputs are reproduced verbatim, including where they stop mid-sentence at the length limit.
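
A comparison like this could be scripted by sending the same prompt and parameters to both endpoints. The sketch below is an assumption-laden illustration, not the notebook code. The "text_inputs" request key and the "generated_text" response field follow the convention of JumpStart text-generation examples and should be verified for the deployed model. The endpoint names are supplied by the caller, and `compare` needs AWS credentials and live endpoints, so it is defined but not called here.

```python
import json
from typing import Any, Dict, List

# Inference parameters reported in this case study.
INFERENCE_PARAMS: Dict[str, Any] = {
    "max_length": 250,
    "num_return_sequences": 1,
    "top_k": 150,
    "top_p": 0.8,
    "do_sample": True,
    "temperature": 1,
}


def build_payload(prompt: str) -> bytes:
    """Serialize the prompt and parameters as a JSON request body.

    The "text_inputs" key is an assumption about the endpoint's input schema.
    """
    return json.dumps({"text_inputs": prompt, **INFERENCE_PARAMS}).encode("utf-8")


def extract_texts(body: bytes) -> List[str]:
    """Pull generated strings out of a response body.

    Assumed shape: a JSON object, or list of objects, with a "generated_text" field.
    """
    parsed = json.loads(body)
    items = parsed if isinstance(parsed, list) else [parsed]
    return [item["generated_text"] for item in items]


def compare(prompt: str, endpoints: Dict[str, str]) -> Dict[str, List[str]]:
    """Send one prompt to each named endpoint and collect the generated texts."""
    import boto3  # imported lazily; only needed when actually calling AWS

    runtime = boto3.client("sagemaker-runtime")
    results = {}
    for label, endpoint_name in endpoints.items():
        response = runtime.invoke_endpoint(
            EndpointName=endpoint_name,
            ContentType="application/json",
            Body=build_payload(prompt),
        )
        results[label] = extract_texts(response["Body"].read())
    return results
```

Because do_sample is True, repeated calls with the same prompt return different text, so any side-by-side comparison reflects one sample per model.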

Prompt: What motivates Captain Ahab’s relentless pursuit of Moby Dick throughout the novel?
Pre-trained Model Answer:
Ahab has driven his family from their home. His family, and most of his crew have died at sea or died as prisoners of war. Ahab, alone in a universe without God, becomes the devil himself. He is the only one who sees himself as the only one in the universe.
Ahab is a character that is a man of few friends, if any at all. Most of the time he is a solitary and angry person, except for the time he spends with the crew of the Pequod. It is this crew that is his best friend, and family.
Ahab has been given the name of white whale and is a great enemy of white whale. He is seen as the only one to ever find white whale in his nature.
Ahab and the white whale are one of the most important character in Moby Dick. The white whale is a metaphor for God and for white whale represents Ahab’s hatred for God, as he says in the novel that the white whale killed him as the devil.
Ahab’s quest to kill the white whale was for revenge and
Fine-tuned Model Answer:
In this chapter, Ahab’s pursuit of Moby Dick is described. His primary motive seems to be vengeance for the White Whale’s murderous rampage on the Indian ship, the Pequod, and its crew. But there is more to it than that. As the White Whale has seemingly marked him for death, Ahab seems to be in a perpetual state of near-mutual annihilative fixation with the monster. As Ahab once told Starbuck, he would rather forget he had ever known the world, than know that he had ever forgotten the White Whale. As Ahab is consumed with the monomania to capture or kill the monster, the monster is consumed with the monomania to evade and elude him. This mutual monomania constitutes their life-and-death struggle. It is a queer coexisting monomania that exists between these two enemies; and it is this coexisting monomania that makes them play such an implacable game of blindman’s-bluff with each other, and with themselves. For the more Ahab chases and pursues the whale, the more the whale eludes him;
Prompt: Analyze the character of Queequeg and his role in the novel named Moby Dick. What does he represent?
Pre-trained Model Answer:
what does he represent to his character and the novel? How does this reflect upon the concept of the American Dream?: what does this concept mean in the context of the book? Does the book or any of the characters seem to be like what we are all too familiar with? Discuss with other students and learn more about the concept of “The American Dream” and what the book is about. Themes of the book? Discuss with your class what you learned from the novel and how this relates to your personal life and how it may relate to you.
Fine-tuned Model Answer:
Is he good or evil?
“I will now proceed to analyze the character of Queequeg, and his role in the book named Moby Dick. I will begin by saying that he represents an ancient and honorable nation, and I will call him Queequeg, the Fourth. He is a native of the central region of the continent, and is thus a denizen of the land of the Fourth. He has always been faithful to the Fourth, and he will be found faithful to it in this, his latest adventure. I will now proceed to give a more full analysis of Queequeg. First, he is an honorable man. Second, he is a high-minded man. Third, he is a man of faith. Fourth, he is a man of his word. Fifth, he is a man who always keeps his promises. Sixth, he is a generous man. Seventh, he is an uncompromising man. Eighth, he is a man who never changes his beliefs or his ideals. Ninth, he is a man who never fights but if it is perfectly just and righteous.
Prompt: Analyze the character named Ahab and his role from the novel named Moby Dick?
Pre-trained Model Answer:
Analyze the character named Ahab and his role from the novel named Moby Dick?: The Whale, Part 2
How would you evaluate his role in the novel Moby Dick
How would you interpret his role
His role in the novel?
Can you think about it as
A character
In his role and what role
How would you analyze his role in the novel?
Would you like to give me your opinion?
Is it true that he is
It’s been discussed in the novel?
Yes, I mean
It’s been discussed?
You know, it’s in the novel, and it’s been
What role that has been discussed?
In the novel, you have to analyze?
What are the points that are discussed
in the novel?
Yes, but you can say
In the novel
In the novel
Yes, you can discuss it, but
You can have a different opinion.
A, but you can
think about it, can you can discuss it, but you can have a different opinion about it, so
But if I can have different opinion about it?
Can you have a different
What is
Fine-tuned Model Answer:
Ahab is a very powerful man. In the novel, he is the captain of a whaling ship called the Pequod. His character has many different aspects. He is a man who wants to kill Moby Dick, a white whale that he sees as his mortal enemy. Because of this, he has many different personality traits. One of the traits is that he is very controlling. He is the captain of the ship Pequod. Because of this, he is the one who makes all the decisions about what the ship will do. Also, he has a lot of authority over the crew. Because of all of this, he can be very demanding of his crew. He has a lot of authority over them, and he can be very demanding.
Ahab is also very self-centered. He is very focused on himself and what he wants. This is why he is so controlling. He wants to control everything that is going on on the ship. He wants to control the crew, the ship, and the whale. He wants everything to go his way. He has

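In the three examples above, the fine-tuned model stayed on topic and wrote in a register closer to the novel’s, while the pre-trained model drifted into loosely related claims, generic study questions, or repetitive loops. The fine-tuned answers are not always faithful to the book, however (one describes the Pequod as an “Indian ship”), so the gain is clearer in relevance than in factual accuracy.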

In conclusion, Amazon SageMaker is a versatile resource for building generative AI solutions that align with specific business requirements. By providing access to pre-trained models, fine-tuning capabilities, and managed infrastructure, SageMaker lets businesses adapt generative AI to their own domains. This can add value by automating content generation, improving customer interactions, and supporting decision-making. With SageMaker, businesses can apply generative AI to innovate and streamline operations in an increasingly data-driven world.


Published On: September 29th, 2023