Added task 3374 #1880

Merged 2 commits, Dec 6, 2024
@@ -0,0 +1,66 @@
3374\. First Letter Capitalization II

Hard

SQL Schema

Table: `user_content`

+--------------+---------+
| Column Name  | Type    |
+--------------+---------+
| content_id   | int     |
| content_text | varchar |
+--------------+---------+
content_id is the unique key for this table. Each row contains a unique ID and the corresponding text content.

Write a solution to transform the text in the `content_text` column by applying the following rules:

* Convert the **first letter** of each word to **uppercase** and the **remaining** letters to **lowercase**
* Special handling for words containing special characters:
    * For words connected with a hyphen `-`, **both parts** should be **capitalized** (**e.g.**, top-rated → Top-Rated)
* All other **formatting** and **spacing** should remain **unchanged**

Return _the result table that includes both the original `content_text` and the modified text following the above rules_.

The result format is in the following example.

**Example:**

**Input:**

user\_content table:

+------------+---------------------------------+
| content_id | content_text                    |
+------------+---------------------------------+
| 1          | hello world of SQL              |
| 2          | the QUICK-brown fox             |
| 3          | modern-day DATA science         |
| 4          | web-based FRONT-end development |
+------------+---------------------------------+

**Output:**

+------------+---------------------------------+---------------------------------+
| content_id | original_text                   | converted_text                  |
+------------+---------------------------------+---------------------------------+
| 1          | hello world of SQL              | Hello World Of Sql              |
| 2          | the QUICK-brown fox             | The Quick-Brown Fox             |
| 3          | modern-day DATA science         | Modern-Day Data Science         |
| 4          | web-based FRONT-end development | Web-Based Front-End Development |
+------------+---------------------------------+---------------------------------+

**Explanation:**

* For content\_id = 1:
    * Each word's first letter is capitalized: "Hello World Of Sql"
* For content\_id = 2:
    * Contains the hyphenated word "QUICK-brown" which becomes "Quick-Brown"
    * Other words follow normal capitalization rules
* For content\_id = 3:
    * Hyphenated word "modern-day" becomes "Modern-Day"
    * "DATA" is converted to "Data"
* For content\_id = 4:
    * Contains two hyphenated words: "web-based" → "Web-Based"
    * And "FRONT-end" → "Front-End"
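
The rules above can be sketched in plain Python as a rough illustration. The helper name `capitalize_words` is hypothetical and not part of this PR. It assumes that only whitespace and hyphens separate word parts, which is one reading of the task statement rather than something the statement spells out for every character.

```python
import re

def capitalize_words(text: str) -> str:
    """Capitalize every word part, treating only whitespace and '-' as separators.

    Hypothetical helper for illustration; not the solution submitted in this PR.
    """
    # Each match is a maximal run of characters that are neither whitespace
    # nor a hyphen; str.capitalize() uppercases its first character and
    # lowercases the rest. Separators are never matched, so spacing is kept.
    return re.sub(r"[^\s-]+", lambda m: m.group(0).capitalize(), text)

# Rows taken from the example input table
examples = [
    "hello world of SQL",
    "the QUICK-brown fox",
    "modern-day DATA science",
    "web-based FRONT-end development",
]
converted = [capitalize_words(t) for t in examples]
for original, result in zip(examples, converted):
    print(f"{original} -> {result}")
```

Because the pattern matches only the word parts, runs of spaces and the hyphens themselves pass through unchanged, which corresponds to the rule that other formatting and spacing stay as they are.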
@@ -0,0 +1,7 @@
# #Hard #Database #2024_12_06_Time_261_ms_(84.21%)_Space_66.3_MB_(17.89%)

import pandas as pd

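def capitalize_content(user_content):
    # str.title() uppercases the first letter of each run of letters and
    # lowercases the rest, so both parts of a hyphenated word are capitalized
    user_content['converted_text'] = (user_content.content_text.apply(lambda x: x.title()))
    # The new column is added to the input frame; rename returns a new frame
    return user_content.rename(columns={'content_text': 'original_text'})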
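
The solution relies on Python's built-in `str.title()`, which starts a new word after any character that is not a letter, not only after spaces and hyphens. The short check below is only an illustration: the first sample comes from the example table, while the apostrophe and digit samples are assumptions added here, and the task statement does not say how such inputs should be handled.

```python
samples = [
    "the QUICK-brown fox",  # row 2 of the example table
    "don't stop",           # apostrophe: assumed input, not in the task
    "3rd place",            # digit before letters: assumed input, not in the task
]

def title_case(text: str) -> str:
    # Same call the solution applies to each row
    return text.title()

results = {s: title_case(s) for s in samples}
for original, converted in results.items():
    print(f"{original!r} -> {converted!r}")
# 'the QUICK-brown fox' -> 'The Quick-Brown Fox'
# "don't stop" -> "Don'T Stop"
# '3rd place' -> '3Rd Place'
```

For the hyphen-only inputs shown in the task, `str.title()` produces the expected table. The other two rows only show where its word-boundary rule might differ from a stricter reading of the rules.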
@@ -0,0 +1,104 @@
import unittest
import pandas as pd

# Embed the script
def capitalize_content(user_content):
    user_content['converted_text'] = (user_content.content_text.apply(lambda x: x.title()))
    return user_content.rename(columns={'content_text': 'original_text'})

# Test suite
class TestCapitalizeContent(unittest.TestCase):

    def test_normal_case(self):
        # Input data
        data = {
            'content_id': [1, 2],
            'content_text': ['hello world', 'python programming']
        }
        df = pd.DataFrame(data)

        # Expected output
        expected_data = {
            'content_id': [1, 2],
            'original_text': ['hello world', 'python programming'],
            'converted_text': ['Hello World', 'Python Programming']
        }
        expected_df = pd.DataFrame(expected_data)

        # Test
        result = capitalize_content(df)
        pd.testing.assert_frame_equal(result, expected_df)

    def test_hyphenated_words(self):
        # Input data
        data = {
            'content_id': [1],
            'content_text': ['well-known fact']
        }
        df = pd.DataFrame(data)

        # Expected output
        expected_data = {
            'content_id': [1],
            'original_text': ['well-known fact'],
            'converted_text': ['Well-Known Fact']
        }
        expected_df = pd.DataFrame(expected_data)

        # Test
        result = capitalize_content(df)
        pd.testing.assert_frame_equal(result, expected_df)

    def test_mixed_case(self):
        # Input data
        data = {
            'content_id': [1],
            'content_text': ['QUICK-brown FOX']
        }
        df = pd.DataFrame(data)

        # Expected output
        expected_data = {
            'content_id': [1],
            'original_text': ['QUICK-brown FOX'],
            'converted_text': ['Quick-Brown Fox']
        }
        expected_df = pd.DataFrame(expected_data)

        # Test
        result = capitalize_content(df)
        pd.testing.assert_frame_equal(result, expected_df)

    def test_empty_input(self):
        # Input data
        df = pd.DataFrame(columns=['content_id', 'content_text'])

        # Expected output
        expected_df = pd.DataFrame(columns=['content_id', 'original_text', 'converted_text'])

        # Test
        result = capitalize_content(df)
        pd.testing.assert_frame_equal(result, expected_df)

    def test_special_characters(self):
        # Input data
        data = {
            'content_id': [1],
            'content_text': ['C++ Programming']
        }
        df = pd.DataFrame(data)

        # Expected output
        expected_data = {
            'content_id': [1],
            'original_text': ['C++ Programming'],
            'converted_text': ['C++ Programming']
        }
        expected_df = pd.DataFrame(expected_data)

        # Test
        result = capitalize_content(df)
        pd.testing.assert_frame_equal(result, expected_df)

if __name__ == '__main__':
    unittest.main()