Channel: Pensieve
Browsing all 35 articles

Making Sense | Republic of Lies

In this episode of the podcast, Sam Harris discusses President Trump’s failure to concede the 2020 presidential election. Setting my political leanings aside, I am astonished to learn that norms and...



Project Euler | Maximum Sum Traversing Top To Bottom In A Triangle

The 18th and the 67th problems in Project Euler are one and the same. The only difference is the input test case: problem 18 has a small input and problem 67 has a large one. For the explanation...
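A minimal sketch of one common way to solve this kind of triangle problem, assuming a bottom-up dynamic programming pass. It is not necessarily the approach taken in the article; the function name `max_path_sum` and the four-row sample triangle are illustrative.

```python
def max_path_sum(triangle):
    """Return the largest top-to-bottom sum, stepping to an adjacent number on the row below."""
    # best[i] holds the best achievable sum starting at position i of the current row.
    best = list(triangle[-1])
    # Fold the rows upward: each cell adds the better of its two children.
    for row in reversed(triangle[:-1]):
        best = [value + max(best[i], best[i + 1]) for i, value in enumerate(row)]
    return best[0]


sample = [
    [3],
    [7, 4],
    [2, 4, 6],
    [8, 5, 9, 3],
]
print(max_path_sum(sample))  # 3 + 7 + 4 + 9 = 23
```

Because each row is visited once, the same function handles a small input and a much larger one without enumerating every path.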



Project Euler | The Millionth Lexicographic Permutation Of The Digits

The 24th problem of Project Euler asks for the one-millionth lexicographic permutation of the digits 0, 1, 2, 3, 4, 5, 6, 7, 8 and 9. If all of the permutations are listed numerically or alphabetically,...
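A minimal sketch of one way to find such a permutation directly, assuming the factorial number system is used to pick each digit in turn instead of generating every permutation. This may differ from the article's method; the function name `nth_permutation` is illustrative.

```python
from math import factorial


def nth_permutation(items, n):
    """Return the n-th (1-indexed) lexicographic permutation of items as a list."""
    pool = sorted(items)
    index = n - 1  # switch to a 0-based rank
    result = []
    while pool:
        # Each choice of leading element accounts for (len(pool) - 1)! permutations.
        block = factorial(len(pool) - 1)
        position, index = divmod(index, block)
        result.append(pool.pop(position))
    return result


print("".join(nth_permutation("0123456789", 1_000_000)))  # 2783915460
```

The sketch assumes `n` lies between 1 and the total number of permutations; a larger `n` raises an `IndexError`.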


Writing UDF To Parse JSON In Hive

Sometimes we need to perform data transformations in ways too complicated for SQL (even with the custom UDFs provided by Hive). Let’s take JSON manipulation as an example. JSON is widely used to store...


SPOJ | NICEDAY — The Day of the Competitors

Problem: Contestants are evaluated in 3 competitions. We say that a contestant A is better than B if A is ranked above B in all three of the competitions they were evaluated in. A is an excellent...



Parse Json in Hive Using Hive JSON Serde

In an earlier post I wrote a custom UDF to read JSON into my table. Since then, I have also learnt about and used the Hive-JSON-Serde. I will use the same example as before. { "customer": {...


Writing Into Dynamic Partitions Using Spark

Hive has this wonderful feature of partitioning — a way of dividing a table into related parts based on the values of certain columns. Using partitions, it’s easy to query a portion of data. Hive...


Rise of Skywalker | Balance Restored?

I watched the final part of the Skywalker saga, and overall liked it. It had plenty of elements to evoke nostalgia and make me feel emotionally comfortable. However, I think the ending could have...



Making Sense | Is Life Actually Worth Living?

This was an interesting discussion. It was a learning opportunity for me, as I have no prior opinion on anti-natalism. David has a unique perspective where he assigns different values to creation of...




Bowen Lookout and Yew Lake Loop

View from Bowen Lookout

We had a fun Saturday, exploring live music scenes in Vancouver. The top recommendation in the group was Guilt & Co, which turned out to be an excellent pick! We stayed for...


1-2 Oblivious Transfer

I learnt about oblivious transfer when reading up about garbled circuits. As an engineer this feels like a fascinating, almost magical protocol. And it’s not just me. Everyone I talked to about this...


Scrapy | Crawl WhoScored For Football Stats

Earlier, I wrote code to crawl the Google Play, iTunes App Store and Goal.com websites. But every time I re-wrote the code to get content from a website, parse it using BeautifulSoup while maintaining...


Reusing Hive Scripts

Amazon’s Elastic Data Pipeline does a fine job of scheduling data processing activities. It spawns a cluster and executes a Hive script when the data becomes available. And after all the jobs have...



CamelCase Partition Column is a Bad Idea in Hive

Outside Java code I prefer snake_case over camelCase. This is mostly a preference without any strong reason: without a proper IDE I find it easier to read snake_case words than camelCase...


Always Specify Region When Calling DynamoDb from Hive

DynamoDb is a key-value store. One can query DynamoDb tables from Hive using the DynamoDBStorageHandler. It’s super easy to set up. Let’s say we have built a platform that collects data for...

