
Imposter Syndrome

What is Imposter Syndrome?

Impostor syndrome (also known as impostor phenomenon, fraud syndrome or the impostor experience) is a concept describing individuals who are marked by an inability to internalize their accomplishments and a persistent fear of being exposed as a “fraud”.

What brought me to blog about / explore this subject is that quite often, if not every day, I have this feeling of being inadequate. I made Twitter my bedtime story. I always get excited to find new things that I could potentially learn, followed by an immediate sense of sadness when my self-doubt kicks in - that I am actually not good enough to accomplish it.

Then I came across this tweet:

So I am not alone then??

I responded to her tweet. And it occurred to me - how do others handle Imposter Syndrome? What are their takes on it?

Exploring tweets that contain the phrase “Imposter Syndrome”

library(rtweet)

## search for 18,000 tweets containing "imposter syndrome" - I adapted this
## from one of Emily Robinson's posts on the RStudio website - thank you
## kindly :) - and changed the search criteria, obviously
rt <- search_tweets(
  "imposter syndrome", n = 18000, include_rts = FALSE
)

## plot time series of tweets
ts_plot(rt, "3 hours") +
  ggplot2::theme_minimal() +
  ggplot2::theme(plot.title = ggplot2::element_text(face = "bold")) +
  ggplot2::labs(
    x = NULL, y = NULL,
    title = "Frequency of imposter syndrome tweets",
    subtitle = "Twitter status (tweet) counts aggregated using three-hour intervals",
    caption = "\nSource: rtweet package"
  )

rt <- lat_lng(rt)

Where are these tweets from?

So this is something that affects people everywhere too? Not all tweets are geo-coded, so there are really many more data points than the map shows.
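
Here is a minimal sketch of how the geo-coded tweets can be put on an interactive map with the leaflet package (the marker styling below is just one choice):

library(leaflet)

## map the geo-coded tweets; rows with missing lat/lng values are
## dropped by leaflet, hence the warning below
leaflet(rt) %>%
  addTiles() %>%
  addCircleMarkers(lng = ~lng, lat = ~lat, radius = 3,
                   stroke = FALSE, fillOpacity = 0.5)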

## Warning in validateCoords(lng, lat, funcName): Data contains 1851 rows with
## either missing or invalid lat/lon values and will be ignored
I would like to extract more information from the tweets, so I am going to use the “tidytext” package for tokenisation.
## do a bit of tidying-up with regex, then tokenise the cleaned text

drop_pattern <- "https://t.co/[A-Za-z\\d]+|http://[A-Za-z\\d]+|&amp;|&lt;|&gt;|RT|https|ht"
unnest_pattern <- "([^A-Za-z_\\d#@']|'(?![A-Za-z_\\d#@]))"
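
A quick sanity check of drop_pattern on a made-up tweet (the text below is hypothetical, not from the collected data):

## hypothetical tweet: retweet marker, an HTML entity and a t.co link
example <- "RT feeling like a fraud again &amp; hiding it https://t.co/Ab1Cd2"
stringr::str_replace_all(example, drop_pattern, "")
## [1] " feeling like a fraud again  hiding it "

The leftover whitespace doesn't matter, since the tokeniser splits on it anyway - though note the bare ht alternative is aggressive and will also strip “ht” inside words like “right”.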


library(dplyr)
library(tidytext)

## df2 (tweets with raw text) and custom_stop_words are set up earlier
df3 <- df2 %>%
  mutate(text_clean = stringr::str_replace_all(text, drop_pattern, "")) %>%
  unnest_tokens(word,
                text_clean,
                token = "regex",
                pattern = unnest_pattern) %>%
  anti_join(custom_stop_words)
## Joining, by = "word"
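
A similar check that unnest_pattern keeps hashtags and @handles intact as single tokens (again a hypothetical example):

## # and @ are excluded from the split pattern, so hashtags and
## handles survive as single tokens
toy <- tibble::tibble(text = "Dealing with #impostersyndrome today, @friend?")
toy %>% unnest_tokens(word, text, token = "regex", pattern = unnest_pattern)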
## word count

df3 %>% count(word, sort = TRUE)
## # A tibble: 7,293 x 2
##    word         n
##    <chr>    <int>
##  1 syndrome  1875
##  2 imposter  1863
##  3 feel       151
##  4 people     139
##  5 feeling    114
##  6 time       100
##  7 amp         98
##  8 real        94
##  9 beat        92
## 10 person      82
## # ... with 7,283 more rows

Sentiment Analysis

## I chose the Bing lexicon because I want a straight comparison between positive and negative sentiment.

bing <- get_sentiments("bing")
bing_rt <- df3 %>% inner_join(bing) %>%
  count(word, sentiment, sort = TRUE) %>%
  ungroup()
## Joining, by = "word"
bing_rt
## # A tibble: 808 x 3
##    word       sentiment     n
##    <chr>      <chr>     <int>
##  1 syndrome   negative   1875
##  2 hard       negative     53
##  3 suffer     negative     52
##  4 doubt      negative     45
##  5 love       positive     41
##  6 confidence positive     40
##  7 struggle   negative     37
##  8 fear       negative     36
##  9 anxiety    negative     32
## 10 bad        negative     32
## # ... with 798 more rows
## Plot the data with ggplot2 for initial exploratory analysis - top 10 words for positive and negative sentiments.

library(ggplot2)

bing_rt %>%
  group_by(sentiment) %>%
  top_n(10) %>%
  ungroup() %>%
  mutate(word = reorder(word, n)) %>%
  ggplot(aes(word, n, fill = sentiment)) +
  geom_col(show.legend = FALSE) +
  facet_wrap(~sentiment, scales = "free_y") +
  labs(y = "Contribution to sentiment",
       x = NULL) +
  coord_flip()
## Selecting by n

## Can a wordcloud give a more distinct visual comparison?

library(reshape2)
## 
## Attaching package: 'reshape2'
## The following object is masked from 'package:tidyr':
## 
##     smiths
library(wordcloud)
## Loading required package: RColorBrewer
df3 %>%
  inner_join(get_sentiments("bing")) %>%
  count(word, sentiment, sort = TRUE) %>%
  acast(word ~ sentiment, value.var = "n", fill = 0) %>%
  comparison.cloud(colors = c("gray20", "gray80"),
                   max.words = 100)
## Joining, by = "word"

What things are mentioned in the context of Imposter Syndrome? I use the “widyr” package to help me explore this further.

library(widyr)

word_pairs <- df3 %>% 
  pairwise_count(word, id, sort = TRUE, upper = FALSE)

word_pairs
## # A tibble: 95,804 x 3
##    item1    item2         n
##    <chr>    <chr>     <dbl>
##  1 imposter syndrome 1786  
##  2 imposter feel      127  
##  3 syndrome feel      126  
##  4 imposter people    120  
##  5 syndrome people    120  
##  6 imposter feeling   100  
##  7 syndrome feeling    99.0
##  8 syndrome beat       90.0
##  9 imposter real       86.0
## 10 syndrome real       86.0
## # ... with 95,794 more rows
set.seed(1234)


library(igraph)
library(ggraph)

word_pairs %>%
  filter(n >= 30) %>%
  graph_from_data_frame() %>%
  ggraph(layout = "fr") +
  geom_edge_link(aes(edge_alpha = n, edge_width = n), edge_colour = "cyan4") +
  geom_node_point(size = 5) +
  geom_node_text(aes(label = name), repel = TRUE, 
                 point.padding = unit(0.2, "lines")) +
  theme_void()

Conclusion

I understand now that how I feel isn't unique to me, and a lot of people are also reaching out for support. Self-compassion is important too. As one of my favourite authors, Jack Kornfield, says:

“If your compassion does not include yourself, it is incomplete.”
