Introducing Sheet2Slide: Convert tables to slides with ease

TL;DR: Slides aren't boring anymore!

Ever been in a place where you had to sit and prepare slides for hours?

What if you could do that in seconds? Let's check it out!

Hey Everyone,
I am Surya Madhavan, a curious developer with ideas in heart and tech in mind.

This week, I want to share a simple project I built to solve one of my own pain points. Say you have to create a PPT to showcase some data (in Excel / CSV / JSON format). The manual work grows with the size of the data. Looking for some relief, I set out to write something simple in Java (since it's my go-to language at the moment). Surprisingly, a library that does exactly this already existed: Apache POI!

Features

  • Parse CSV data and populate slides with tables.

  • Add rows dynamically based on row height and pending space in a slide.

  • Add header & footer if provided.

  • Configure the app through a setup.properties file.

  • Log with SLF4J and Log4j 2 for easier troubleshooting.
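The "add rows dynamically" feature boils down to a little arithmetic: work out how many rows fit in the space left on a slide, then split the data into chunks of that size. Here is a minimal sketch of that idea in plain Java. It is not the project's actual code; the class name SlidePaginator, its methods, and the sample sizes (a 540 pt tall slide, 60 pt header, 40 pt footer, 25 pt rows) are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: decides how many table rows go on each slide.
final class SlidePaginator {

    // Number of whole rows that fit in the remaining vertical space (both values in points).
    static int rowsThatFit(double remainingHeight, double rowHeight) {
        if (rowHeight <= 0) {
            throw new IllegalArgumentException("rowHeight must be positive");
        }
        return (int) Math.max(0, Math.floor(remainingHeight / rowHeight));
    }

    // Splits the rows into per-slide chunks of at most rowsPerSlide rows each.
    static <T> List<List<T>> paginate(List<T> rows, int rowsPerSlide) {
        if (rowsPerSlide <= 0) {
            throw new IllegalArgumentException("rowsPerSlide must be positive");
        }
        List<List<T>> slides = new ArrayList<>();
        for (int start = 0; start < rows.size(); start += rowsPerSlide) {
            int end = Math.min(start + rowsPerSlide, rows.size());
            slides.add(new ArrayList<>(rows.subList(start, end)));
        }
        return slides;
    }

    public static void main(String[] args) {
        // Assumed layout: slide height minus header and footer leaves 440 pt for the table.
        double remaining = 540 - 60 - 40;
        int perSlide = rowsThatFit(remaining, 25);

        List<String> rows = new ArrayList<>();
        for (int i = 1; i <= 40; i++) {
            rows.add("row " + i);
        }
        List<List<String>> slides = paginate(rows, perSlide);
        System.out.println(perSlide + " rows per slide, " + slides.size() + " slides");
    }
}
```

With these sample numbers, 40 rows end up on three slides (17, 17 and 6 rows). In a real deck each chunk would become one table on its own slide.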

How to?

To run this application:

  • Clone the repository. (also star the repo ;) )

  • Set the config properties in the setup.properties file.

  • Put the CSV files in the resources/input folder.

  • Run the SheetToSlideApp program using any IDE.

  • Your generated PPTX files should appear in the resources/output folder within seconds!
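To give a feel for the config step, here is a rough sketch of how a setup.properties file could be read with the JDK's own java.util.Properties. The key names (slide.header, slide.footer, table.row.height), the default row height, and the SetupConfig class are assumptions for illustration; check the file in the repository for the real keys.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical config holder; key names are assumed, not taken from the project.
final class SetupConfig {
    final String header;     // empty when no header is configured
    final String footer;     // empty when no footer is configured
    final double rowHeight;  // table row height in points

    private SetupConfig(String header, String footer, double rowHeight) {
        this.header = header;
        this.footer = footer;
        this.rowHeight = rowHeight;
    }

    // Reads the properties from any Reader (a file reader in a real app).
    static SetupConfig load(Reader source) {
        Properties props = new Properties();
        try {
            props.load(source);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String header = props.getProperty("slide.header", "").trim();
        String footer = props.getProperty("slide.footer", "").trim();
        // Fall back to an assumed default of 25 pt when the key is missing.
        double rowHeight = Double.parseDouble(props.getProperty("table.row.height", "25").trim());
        return new SetupConfig(header, footer, rowHeight);
    }

    boolean hasHeader() { return !header.isEmpty(); }

    boolean hasFooter() { return !footer.isEmpty(); }

    public static void main(String[] args) {
        // Inline sample standing in for the contents of setup.properties.
        String sample = "slide.header=Quarterly Sales\ntable.row.height=30\n";
        SetupConfig config = load(new StringReader(sample));
        System.out.println("header: " + config.header
                + ", footer set: " + config.hasFooter()
                + ", row height: " + config.rowHeight);
    }
}
```

Treating a missing header or footer as an empty string keeps the "add header & footer if provided" check down to a single hasHeader() / hasFooter() call.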

Challenges faced

  • Positioning the table / text box was quite a task. Positioning relies on a concept called an anchor: we initialize a Rectangle / Rectangle2D object with X-Y coordinates along with its width and height.

  • The units are not the familiar pixels or inches. Coordinates are expressed in points (1/72 of an inch), and POI ships an internal conversion class (Units) to convert points to, say, pixels or some other unit. There's still more for me to learn about the units used in Apache POI though.
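To make the anchor and unit ideas a bit more concrete, here is a small sketch using only the JDK. It builds an anchor rectangle in points and shows the arithmetic behind the conversions: 1 inch = 72 points = 914400 EMU (the unit Office files use internally), so 1 point = 12700 EMU. The AnchorUnits class and its method names are hypothetical; POI's own Units class covers the same kind of conversion, so treat this as an illustration of the math rather than a replacement for it. The 720 pt slide width is also an assumption.

```java
import java.awt.geom.Rectangle2D;

// Hypothetical helper illustrating anchor placement and unit conversion.
final class AnchorUnits {
    static final double POINTS_PER_INCH = 72.0;
    static final int EMU_PER_POINT = 12700; // 914400 EMU per inch / 72 points per inch

    static double inchesToPoints(double inches) {
        return inches * POINTS_PER_INCH;
    }

    static long pointsToEmu(double points) {
        return Math.round(points * EMU_PER_POINT);
    }

    // Pixels depend on screen resolution; 96 DPI is a common assumption.
    static double pointsToPixels(double points, double dpi) {
        return points * dpi / POINTS_PER_INCH;
    }

    // Anchor of the given size, centred horizontally on the slide, with its top edge at y.
    static Rectangle2D centeredAnchor(double slideWidth, double y, double width, double height) {
        return new Rectangle2D.Double((slideWidth - width) / 2, y, width, height);
    }

    public static void main(String[] args) {
        // Assumed slide width of 720 pt (10 inches).
        Rectangle2D anchor = centeredAnchor(720, 50, 600, 300);
        System.out.println("x=" + anchor.getX() + " y=" + anchor.getY()
                + " w=" + anchor.getWidth() + " h=" + anchor.getHeight());
        System.out.println("72 pt in EMU: " + pointsToEmu(72));
        System.out.println("72 pt in px at 96 DPI: " + pointsToPixels(72, 96));
    }
}
```

A Rectangle2D like this is what gets passed as the anchor of a table or text box; the x and y values are the top-left corner, measured from the top-left of the slide.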

About Apache POI

Keep reading to understand the fundamentals of this app. Apache POI is an open-source library that supports CRUD operations on Microsoft Office files (Word, Excel, PowerPoint). I will talk about handling PowerPoint files in detail here. Apache POI has a different set of classes for each format: HSLF for .ppt and XSLF for .pptx. I went with XSLF since PPTX is the current format for PowerPoint.

My understanding of Apache POI classes and its usage

Here's my understanding of the Apache POI design. I have not covered all the classes, only those that were crucial for understanding the library and building basic scripts with it. The nesting below shows which object holds which, not inheritance.

  • XMLSlideShow - Top-level class responsible for creating a PPT object from scratch or from an existing file.

    • XSLFSlideMaster - Holds the layouts and extra configurations.

      • XSLFSlideLayout - Represents a layout. We can pick predefined layouts using the SlideLayout enum.

    • XSLFSlide - Represents each Slide object.

      • XSLFTextShape - Useful for adding a text box and its contents / paragraphs.

      • XSLFTextParagraph, XSLFTextRun - Useful classes for setting text and modifying properties like Font Family, Font Size, etc.

      • XSLFTable - Here comes our trump card. We can build a table with a defined number of rows and columns.

        • XSLFTableCell - The atomic building block of an XSLFTable. This is where we populate the cells with instances of the Text classes (mentioned above).

For those wanting to try building such apps, here is a cool guide from Apache POI itself.

Any feedback is appreciated!

Please star the repository and I will be back with another blog / resource soon!